Re: ACTION-187: extensibility and parsing

On Fri, Sep 24, 2010 at 2:56 PM, Davy Van Deursen
<davy.vandeursen@ugent.be> wrote:

> Quoting Silvia Pfeiffer <silviapfeiffer1@gmail.com>:
> > On Wed, Sep 22, 2010 at 9:12 PM, Philip Jägenstedt <philipj@opera.com>
> > wrote:
> >
> >> As requested, a short summary of the long-standing issue of syntax,
> >> parsing and how that relates to extensibility.
> >>
> >> By extensibility I am not primarily talking about 3rd parties extending
> >> MF, but about our own possibilities of updating the spec after MF 1.0.
> >> For the purpose of discussion, assume that we want to add a dimension
> >> for filtering the audio, e.g., freq=300,3000 to keep only the part of
> >> the audio that corresponds (approximately) to human voice
> >> (300Hz-3000Hz).
> >>
> >> How will implementations of MF 1.0 handle t=10,500&freq=300,3000 ? This
> >> is the core point of disagreement, and the question is really about how
> >> MF 1.0 parsers should work. Leaving it undefined is not a good option,
> >> as the history clearly shows. Two other options have been on the table:
> >>
> >> 1. Require that parsing follow a strict ABNF syntax like the one we
> >> have. Since freq is not part of the MF 1.0 syntax, parsing
> >> t=10,500&freq=300,3000 will fail and the whole fragment will be
> >> ignored, including t=10,500.
> >>
> >> 2. Require that parsing follow an algorithm or a more forgiving ABNF
> >> syntax. The concrete suggestion I've made is that the algorithm or
> >> syntax should match how query strings work. That is, a list of
> >> key-value pairs is formed by splitting the string on & and =. As a
> >> second step, that list is traversed to match the keys against the
> >> dimensions and parsed according to the ABNF syntax of each dimension.
> >> Crucially, unrecognized/invalid keys or values are ignored. That means
> >> that in the above example, the time dimension will keep working even if
> >> an unrecognized (to a MF 1.0 implementation) freq dimension is used.
> >>
> >> Note: Neither 1 nor 2 is a requirement to use any specific
> >> implementation technique, only to behave *as if* you are, which still
> >> leaves plenty of room for different approaches.
> >>
> >> I strongly favor option number 2, and see these benefits:
> >>
> >> * It works like query strings, just like one would expect from looking
> >> at the syntax. The algorithm I've suggested is actually from testing
> >> query string parsing in PHP, ASP, ASP.NET, CGI.pl and JSP, as reported
> >> earlier on this list.
> >>
> >> * It's simpler for implementors, as we won't have to implement
> >> everything at once. This is likely what's going to happen, as the time
> >> dimension is ready to implement, while it is still not clear how to
> >> apply the named dimension to e.g. a WebM or Ogg resource.
> >>
> >> * It's better for extensibility, as adding new dimensions doesn't break
> >> all existing implementations. Imagine if adding a new element to HTML
> >> would cause pages to render completely blank in all existing browsers.
> >> Not even XHTML is that strict.
> >>
> >> Please comment, we need to reach some kind of consensus on this soon and
> >> move on. If we can agree on what we want, we can then discuss how to
> >> change the spec accordingly (algorithm or ABNF, etc...)
> >
> >
> >
> > I also strongly favor option number 2. I don't think anything else makes
> > sense, actually, because we would fail to interoperate with other
> > schemes that use fragments and queries on media resources. Only
> > name-value pairs that do not parse according to our ABNF will be ignored
> > from the viewpoint of media fragments. They can be used by the browser
> > or server for other purposes.
>
> Same opinion here, option 1 doesn't seem to make sense. However, should we
> allow any unknown constructions in the URI fragment or just key-value
> pairs with an unknown key? For example:
> - t=10,500&freq=300,3000: should be a valid fragment IMO, as indicated by
>   Philip's arguments;
> - t=10,500&foo: is this a valid media fragment? According to Philip's
>   parsing algorithm, I think it is not. From an extension point of view,
>   disallowing such a construction should be fine since we can rewrite
>   this as t=10,500&foo=true if we want to obtain key-value pairs. Note
>   that I'm not in favor of allowing other things than key-value pairs, I
>   just wanted to point out this case.
>


I agree, it should just be name-value pairs.
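
To make the difference between the two options concrete, here is a rough
sketch of both behaviours. It is not spec text: the function names are made
up for illustration, only the t dimension is handled, its value syntax is
simplified to plain seconds rather than the full MF 1.0 ABNF, and
percent-decoding is left out. It also assumes that a bare token such as
"foo" is simply skipped in the lenient case, which is one possible reading
of the question Davy raises above.

```python
import re

# Simplified, hypothetical value syntax for "t": "start,end", "start" or
# ",end" in plain seconds. The real MF 1.0 ABNF allows more than this.
NPT_VALUE = re.compile(r"^(\d+(?:\.\d+)?)?(?:,(\d+(?:\.\d+)?))?$")


def parse_time(value):
    """Return (start, end) in seconds, or None if the value is invalid."""
    m = NPT_VALUE.match(value)
    if not m or (m.group(1) is None and m.group(2) is None):
        return None
    start = float(m.group(1)) if m.group(1) is not None else 0.0
    end = float(m.group(2)) if m.group(2) is not None else None
    if end is not None and end <= start:
        return None
    return (start, end)


# Dimensions this (hypothetical) MF 1.0 implementation knows about.
DIMENSIONS = {"t": parse_time}


def parse_lenient(fragment):
    """Option 2: split on & and =, ignore anything unrecognized or invalid."""
    result = {}
    for pair in fragment.split("&"):
        name, sep, value = pair.partition("=")
        if not sep:
            continue  # bare token such as "foo": not a name-value pair
        parser = DIMENSIONS.get(name)
        if parser is None:
            continue  # unknown dimension such as "freq"
        parsed = parser(value)
        if parsed is not None:
            result[name] = parsed  # assumption: a later duplicate wins
    return result


def parse_strict(fragment):
    """Option 1: any unknown or invalid part voids the whole fragment."""
    result = {}
    for pair in fragment.split("&"):
        name, sep, value = pair.partition("=")
        parser = DIMENSIONS.get(name) if sep else None
        parsed = parser(value) if parser else None
        if parsed is None:
            return {}
        result[name] = parsed
    return result


if __name__ == "__main__":
    for frag in ("t=10,500", "t=10,500&freq=300,3000", "t=10,500&foo"):
        print(frag, parse_lenient(frag), parse_strict(frag))
```

With this sketch, the lenient parser keeps the time dimension for
t=10,500&freq=300,3000 while the strict one returns nothing, which is the
breakage Philip describes for option 1.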

Silvia.

Received on Friday, 24 September 2010 05:10:53 UTC