Syntax
From Media Fragments Working Group Wiki
This page discusses the current (evolving) syntax for representing Media Fragment URIs.
Query versus Fragment
Character | Pros | Cons |
---|---|---|
# | The context of the fragment is kept. | The fragment is not sent to the server. We need to encode the content of the fragment through means outside the URI, e.g. a new HTTP header. |
? | The fragment is sent to the server as a query. | A new resource is created. The context of the fragment is lost. |
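The trade-off in the table can be illustrated with a small sketch. It uses Python's standard `urllib.parse.urlsplit`; the helper `request_target` is a hypothetical name introduced here, and it only approximates what an HTTP client puts on the request line.

```python
from urllib.parse import urlsplit

def request_target(uri: str) -> str:
    """Return the path-plus-query part that a client would send to the server.

    The fragment is deliberately left out: it stays on the client side.
    """
    parts = urlsplit(uri)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return target

frag_uri = "http://www.example.com/movie.mov#t=12.33,21.16"
query_uri = "http://www.example.com/movie.mov?t=12.33,21.16"

print(request_target(frag_uri))    # /movie.mov
print(request_target(query_uri))   # /movie.mov?t=12.33,21.16
print(urlsplit(frag_uri).fragment) # t=12.33,21.16
```

With `#`, the server sees only the main resource, so the time range has to travel some other way; with `?`, the range reaches the server but names a different resource.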
Decisions
- At the moment, the preference is for the fragment (#) symbol, because it keeps the relationship to the main resource and fits better with an existing time offset scheme available in RTP/RTSP.
- The URI fragment (or query) symbol will be followed by a set of name-value parameters:
- The primary separator could be the ampersand (&), in the same way that CGI queries are composed, or the semicolon (;). See the results of the questionnaire
- The secondary separator would be the comma (,).
- The WG observes that the comma is sometimes used in SMPTE values, for example as a thousands separator or to signal drop-frame processing, but considers this quite locale-specific behavior.
- The fragment dimensions considered are: temporal, spatial, track and name.
- Each fragment dimension should be specified only once as a parameter; if a dimension is repeated, only the first occurrence is evaluated.
- We think our projections on the time/space/track axes are commutative, and therefore the parameters can be applied in any order.
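As a minimal sketch of these decisions, the snippet below splits a fragment on the primary separator `&`, keeps only the first occurrence of each dimension, and shows that parameter order does not affect the result. The function name `parse_fragment` is hypothetical, and silently skipping malformed pairs is an assumption rather than a WG decision.

```python
def parse_fragment(fragment: str) -> dict[str, str]:
    """Split a fragment into name=value pairs, keeping the first value per name."""
    params: dict[str, str] = {}
    for pair in fragment.split("&"):
        name, sep, value = pair.partition("=")
        if not sep or not name:
            continue  # assumption: ignore anything that is not name=value
        params.setdefault(name, value)  # only the first occurrence is kept
    return params

a = parse_fragment("t=12.33,21.16&xywh=20,20,40,40&t=0,5")
b = parse_fragment("xywh=20,20,40,40&t=12.33,21.16")

print(a)       # {'t': '12.33,21.16', 'xywh': '20,20,40,40'}
print(a == b)  # True: the order of the parameters does not matter
```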
Proposed ideas but rejected
- Include a special character after the URI fragment symbol that is not allowed as an identifier in HTML4/5 but is allowed in the URI specification, and that clearly differentiates this fragment addressing from web page fragment addressing. The group thinks it burdens the syntax for little added value.
- The values of the dimensions still have to be qualified - e.g. time could be discontinuous section, region can only be one square, track can be a set of values, and name can only be one identifier.
Proposed Syntax
This is the result of the brainstorming session we had during our 2nd face-to-face meeting in Ghent (BE), 9-10 December 2008.
General
The grammar below is written in a pseudo-EBNF syntax.
- 4 dimensions: time, space, track, name
- combination of them: name XOR (time, space, track)
- the reason is that a name has no particular meaning: it can, for example, already specify a region of an image, a sequence in time, or any combination
- order is not relevant, the processing will always be:
- time or track selection, dealt with at the container level
- spatial clipping if any, dealt with at the codec level (i.e. for a particular track!)
- name often (always?) refers to a temporal selection
- first class separator is '&' and second class separator is ',' (see WG resolution)
- an extreme degenerate case would be: select all video tracks of a media resource and then apply a spatial clipping; the result of such an operation would be unspecified. Another extreme case would be to select a temporal interval of a still image, resulting in a still image
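The "name XOR (time, space, track)" rule and the fixed processing order can be sketched as follows. The parameter names `t`, `xywh`, `aspect`, `track` and `id` are the ones used on this page; the function names and the exact grouping into two levels are illustrative assumptions.

```python
CONTAINER_LEVEL = ("t", "track")    # time or track selection
CODEC_LEVEL = ("xywh", "aspect")    # spatial clipping
NAME_DIMENSION = "id"

def is_valid_combination(names: set) -> bool:
    """True if the fragment uses either a name or axis dimensions, never both."""
    has_name = NAME_DIMENSION in names
    has_axis = any(n in names for n in CONTAINER_LEVEL + CODEC_LEVEL)
    return has_name != has_axis  # exclusive or

def processing_order(names: set) -> list:
    """Return the axis dimensions in the order they would be processed."""
    return [n for n in CONTAINER_LEVEL + CODEC_LEVEL if n in names]

print(is_valid_combination({"t", "xywh"}))    # True
print(is_valid_combination({"id", "t"}))      # False
print(processing_order({"xywh", "track", "t"}))  # ['t', 'track', 'xywh']
```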
Dimensions
- Time:
- t = timerange
- timerange : ["npt" ":"] [clocktime] "," [clocktime] | format ":" [frametime] "," [frametime] | "clock" ":" [utctime] "," [utctime]
- clocktime : DIGIT+ ["." DIGIT+ ] ["s"] | DIGIT+ ":" 2DIGIT ":" 2DIGIT ["." DIGIT+ ]
- frametime : DIGIT+ ":" 2DIGIT ":" 2DIGIT [":" 2DIGIT ["." 2DIGIT] ]
- utctime : 8DIGIT "T" 6DIGIT [ "." 2DIGIT ] "Z"
- format : "smpte" | "smpte-25" | "smpte-30" | "smpte-30-drop"
- npt is the default format, and can be specified as seconds or hh:mm:ss, with optional (dot-separated) fractional seconds
- other formats are frame based, and always specified as hh:mm:ss, with optional (colon-separated) frame number.
- the intention is that this follows the SMIL and RTSP spec (with the exception of using ":" to separate format name, where those specs use "=").
- Space:
- xywh = [unit ":"] int "," int "," int "," int
- unit = pixel | %
- pixel is the default unit
- origin (0,0) is always the top-left of the screen
- aspect = int":"int
- aspect defines a crop region centered in the center of the current image with the maximum size possible respecting that aspect ratio.
- cm, inch and pt are not supported, since we believe media formats will rarely provide a mapping between these units and pixels
- Track:
- track = "'" UTF-8 string "'" (%-escaped or not?)
- see #Character_encoding_of_track_names_and_named_fragments_in_container_formats for more information on character encodings used within container formats
- link with the MAWG to agree on a ROE-like syntax for describing tracks within a media resource (could use the XMP syntax)
- no pre-definition of track names such as audio, video, subtitles, because it can be ambiguous to select the track depending on the container format
- Name:
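The `aspect` parameter described under Space above asks for the largest centered region with a given ratio. A minimal sketch of that computation, assuming integer pixel dimensions and rounding down (the page does not say how to round), could look like this; `aspect_crop` is a hypothetical helper name.

```python
def aspect_crop(width: int, height: int, ratio_w: int, ratio_h: int):
    """Return (x, y, w, h) of the largest centered region with ratio_w:ratio_h.

    The origin (0,0) is the top-left corner, as stated for xywh.
    """
    if width * ratio_h >= height * ratio_w:
        # The image is wider than the target ratio: keep full height.
        crop_h = height
        crop_w = height * ratio_w // ratio_h
    else:
        # The image is taller than the target ratio: keep full width.
        crop_w = width
        crop_h = width * ratio_h // ratio_w
    x = (width - crop_w) // 2
    y = (height - crop_h) // 2
    return x, y, crop_w, crop_h

print(aspect_crop(640, 480, 16, 9))    # (0, 60, 640, 360)
print(aspect_crop(1920, 1080, 4, 3))   # (240, 0, 1440, 1080)
```

Under these assumptions, `aspect=16:9` on a 640x480 image selects the same region as `xywh=0,60,640,360`.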
Examples
Some valid URI fragments:
- http://www.example.com/movie.mov#t=12.33,21.16
- http://www.example.com/movie.mov#t=smpte-25:00:12:33:06,00:21:16:00
- http://www.example.com/movie.mov#xywh=20,20,40,40
- http://www.example.com/movie.mov#aspect=16:9
- http://www.example.com/movie.mov#track='audio1'
- http://www.example.com/movie.mov#id='the%20kiss%20scene'
- http://www.example.com/movie.mov#t=12.33,21.16&xywh=20,20,40,40
- http://www.example.com/movie.mov#track='audio1'&t=12.33,21.16
- http://www.example.com/movie.mov#t=12.33,21.16&xywh=20,20,40,40&track='video2'
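The `t=` values in the examples above use the default npt format. As a rough sketch, such a range can be converted to seconds as shown below. The function names are hypothetical, only npt is handled, and returning `None` for a missing start or end is an assumption about how an open-ended range might be represented.

```python
def npt_to_seconds(value: str) -> float:
    """Convert an npt clock time (seconds or hh:mm:ss[.fraction]) to seconds."""
    if ":" not in value:
        # Seconds form; the trailing "s" is optional in the proposed syntax.
        if value.endswith("s"):
            value = value[:-1]
        return float(value)
    parts = value.split(":")
    if len(parts) != 3:
        raise ValueError(f"not an npt time: {value!r}")
    hours, minutes, seconds = parts
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)

def parse_npt_range(timerange: str):
    """Parse the value of t= into (start, end); a missing bound becomes None."""
    if timerange.startswith("npt:"):
        timerange = timerange[len("npt:"):]
    start, _, end = timerange.partition(",")
    return (
        npt_to_seconds(start) if start else None,
        npt_to_seconds(end) if end else None,
    )

print(parse_npt_range("12.33,21.16"))      # (12.33, 21.16)
print(parse_npt_range("npt:0:02:01.5,"))   # (121.5, None)
```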
Discussion
- ISSUE 1: "Combining Media Fragment URI with other time-clipping methods". More generally, how to cover the cases where the media fragment is i) encompassing, ii) embedding, iii) disjoint or iv) partially overlapping the boundaries of the other time-clipping method? Specifying a time-clipping method, for example in SMIL, is relative to the (timeline of the) resource. Therefore, if one specifies:
<video clipBegin="5s" clipEnd="15s" src="http://www.example.com/video.mov#t=20,30"/>
- Two possibilities:
- EITHER the media fragment is regarded as out-of-context, and the clipping MUST be applied relative to the media fragment and bounded by it, i.e. the UA plays the video segment between seconds 25 (=max[20,20+5]) and 30 (=min[30,20+15]).
- OR the media fragment is regarded as in-context and it depends on the application what the best solution is for the UA to do: some UAs may, for example, provide a timeline that encompasses the whole document, not only the time clipping. Implementors SHOULD follow semantics similar in spirit to the previous bullet, but adapted to their situation.
- It is a good idea to not mix units, i.e. cm with pixels for defining a spatial region or npt with smpte values for defining a temporal interval
- ACTION-27: Units recommended to be used for the spatial dimension:
- pixels, percentages (as percentage of width and height)
- recommendation to NOT have cm/in/pt since the media format rarely provides the mapping between pixels and these units
- ACTION-28: Units recommended to be used for the temporal dimension:
- npt, smpte, smpte-25, smpte-30, smpte-30-drop
- ACTION-84, ACTION-97: Jean Pierre Evain suggested during the Barcelona F2F meeting to investigate whether editUnit could be used as another dimension for addressing media fragments. The WG thinks this mechanism should not be supported, for example because editUnits are different for different tracks, see the full explanation.
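The arithmetic of the first possibility under ISSUE 1 (the out-of-context reading) can be written out as a small sketch. The function name `combine_clip` is hypothetical, and returning `None` when the two ranges do not overlap is an assumption, since the page leaves the disjoint case open.

```python
def combine_clip(frag_start, frag_end, clip_begin, clip_end):
    """Apply a clip (relative to the fragment) and bound it by the fragment.

    All values are in seconds. Returns (start, end) on the timeline of the
    original resource, or None if nothing is left to play (assumption).
    """
    start = max(frag_start, frag_start + clip_begin)
    end = min(frag_end, frag_start + clip_end)
    if start >= end:
        return None
    return start, end

# <video clipBegin="5s" clipEnd="15s" src="...video.mov#t=20,30"/>
print(combine_clip(20, 30, 5, 15))   # (25, 30)
print(combine_clip(20, 30, 12, 15))  # None: the clip starts after the fragment ends
```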
Formal Grammar
    segment       = mediasegment / *( pchar / "/" / "?" )  ; augmented fragment
                                                           ; definition taken from
                                                           ; rfc3986

    npt-sec       = 1*DIGIT [ "." *DIGIT ]                      ; definitions taken
    npt-hhmmss    = npt-hh ":" npt-mm ":" npt-ss [ "." *DIGIT]  ; from rfc2326
    npt-hh        = 1*DIGIT     ; any positive number
    npt-mm        = 2DIGIT      ; 0-59
    npt-ss        = 2DIGIT      ; 0-59
    ;
    ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; Media Segment ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
    ;
    mediasegment  = namesegment / axissegment
    axissegment   = ( timesegment / spacesegment / tracksegment )
                    *( "&" ( timesegment / spacesegment / tracksegment ) )
    ;
    ; note that this does not capture the restriction to one kind of fragment
    ; in the axisfragment definition, unless we list explicitly the 14 cases.
    ;
    ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; Time Segment ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
    ;
    timesegment   = timeprefix "=" timeparam
    timeprefix    = %x74                                   ; "t"
    timeparam     = npttimedef / smptetimedef / clocktimedef
    npttimedef    = [ deftimeformat ":"] ( npttime [ "," npttime ] ) / ( "," npttime )
    smptetimedef  = smpteformat ":"( frametime [ "," frametime ] ) / ( "," frametime )
    clocktimedef  = clockformat ":"( clocktime [ "," clocktime ] ) / ( "," clocktime )
    deftimeformat = %x6E.70.74                             ; "npt"
    smpteformat   = %x73.6D.70.74.65                       ; "smpte"
                  / %x73.6D.70.74.65.2D.32.35              ; "smpte-25"
                  / %x73.6D.70.74.65.2D.33.30              ; "smpte-30"
                  / %x73.6D.70.74.65.2D.33.30.2D.64.72.6F.70 ; "smpte-30-drop"
    clockformat   = %x63.6C.6F.63.6B                       ; "clock"
    timeunit      = %x73                                   ; "s"
    npttime       = npt-sec / npt-hhmmss
    frametime     = 1*DIGIT ":" 2DIGIT ":" 2DIGIT [ ":" 2DIGIT [ "." 2DIGIT ] ]
    clocktime     = (datetime / walltime / date)
    datetime      = date "T" walltime
    date          = years "-" months "-" days
    walltime      = (HHMM / HHMMSS) tzd
    HHMM          = hours24 ":" minutes
    HHMMSS        = hours24 ":" minutes ":" seconds ["." fraction]
    years         = 4DIGIT
    months        = 2DIGIT      ; range from 01 to 12
    days          = 2DIGIT      ; range from 01 to 31
    hours24       = 2DIGIT      ; range from 00 to 23
    minutes       = 2DIGIT      ; range from 00 to 59
    seconds       = 2DIGIT      ; range from 00 to 59
    fraction      = 1*DIGIT
    tzd           = "Z" / (("+" / "-") hours24 ":" minutes )
    ;
    ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; Space Segment ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
    ;
    spacesegment  = xywhdef / aspectdef
    xywhdef       = xywhprefix "=" xywhparam
    aspectdef     = aspectprefix "=" aspectparam
    xywhprefix    = %x78.79.77.68                          ; "xywh"
    aspectprefix  = %x61.73.70.65.63.74                    ; "aspect"
    xywhparam     = [ xywhunit ":" ] 1*DIGIT "," 1*DIGIT "," 1*DIGIT "," 1*DIGIT
    xywhunit      = %x70.69.78.65.6C                       ; "pixel"
                  / %x70.65.72.63.65.6E.74                 ; "percent"
    aspectparam   = 1*DIGIT ":" 1*DIGIT
    ;
    ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; Track Segment ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
    ;
    tracksegment  = trackprefix "=" trackparam
    trackprefix   = %x74.72.61.63.6B                       ; "track"
    trackparam    = utf8string
    ;
    ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; Name Segment ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
    ;
    namesegment   = nameprefix "=" nameparam
    nameprefix    = %x69.64                                ; "id"
    nameparam     = utf8string
    ;
    ;;;;;;;;;;;;;;;;;;;;;;;;;;;; Imported definitions ;;;;;;;;;;;;;;;;;;;;;;;;;;;;
    ;
    DIGIT         = <DIGIT, defined in rfc4234#3.4>
    pchar         = <pchar, defined in rfc3986>
    unreserved    = <unreserved, defined in rfc3986>
    pct-encoded   = <pct-encoded, defined in rfc3986>
    utf8string    = "'" *( unreserved / pct-encoded ) "'"  ; utf-8 character
                                                           ; encoded URI-style
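To give a feel for how a production from this grammar maps onto code, here is a tentative translation of `xywhdef` into a regular expression. It is only a sketch: the function name `parse_xywh` is hypothetical, and defaulting the unit to `pixel` follows the statement elsewhere on this page that pixel is the default unit.

```python
import re

# xywhdef = "xywh" "=" [ xywhunit ":" ] 1*DIGIT "," 1*DIGIT "," 1*DIGIT "," 1*DIGIT
XYWH_RE = re.compile(
    r"xywh=(?:(pixel|percent):)?([0-9]+),([0-9]+),([0-9]+),([0-9]+)"
)

def parse_xywh(segment: str):
    """Return (unit, x, y, w, h) for a valid xywh segment, else None."""
    match = XYWH_RE.fullmatch(segment)
    if match is None:
        return None
    unit = match.group(1) or "pixel"  # pixel is the default unit
    x, y, w, h = (int(g) for g in match.groups()[1:])
    return unit, x, y, w, h

print(parse_xywh("xywh=20,20,40,40"))          # ('pixel', 20, 20, 40, 40)
print(parse_xywh("xywh=percent:25,25,50,50"))  # ('percent', 25, 25, 50, 50)
print(parse_xywh("xywh=20,20,40"))             # None
```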
- Discussion 1: syntax for writing timestamps?
- Jack has first elaborated an RTSP-style syntax
- Conrad has proposed to relax this syntax, making the hours optional when frames are omitted and mandatory when frames are present.
- Jack also proposed to replace the colon separator with the characters 's', 'm', 'h', which is closer to the Google syntax
- Resolution: Include in the WD a clear paragraph stating we request feedback from the community on this issue!
- Discussion 2: errors in the current syntax
- Include in the WD a paragraph stating that we disallow single quote (') in a utf8string, see Jack's message
- Include in the WD a paragraph stating that we disallow sub-delims (e.g. &) in a utf8string, see Jack's message
- Spell 'percent' instead of the '%' character, see Jack's message (DONE)
Points 1 and 2 of Discussion 2 were addressed by changing the production of utf8string to use only unreserved and pct-encoded from rfc3986.
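A short sketch shows the effect of restricting utf8string to unreserved and pct-encoded characters: a single quote or an ampersand inside a track name or id must then be percent-encoded. The helper names are hypothetical; `quote` and `unquote` come from Python's standard `urllib.parse`, and the regular expression is one reading of the rfc3986 productions.

```python
import re
from urllib.parse import quote, unquote

# utf8string = "'" *( unreserved / pct-encoded ) "'"
UTF8STRING_RE = re.compile(r"'(?:[A-Za-z0-9\-._~]|%[0-9A-Fa-f]{2})*'")

def encode_utf8string(text: str) -> str:
    """Percent-encode everything except unreserved characters and add quotes."""
    return "'" + quote(text, safe="") + "'"

def decode_utf8string(token: str) -> str:
    """Validate a utf8string token and return the decoded text."""
    if UTF8STRING_RE.fullmatch(token) is None:
        raise ValueError(f"not a valid utf8string: {token!r}")
    return unquote(token[1:-1])

print(encode_utf8string("the kiss scene"))          # 'the%20kiss%20scene'
print(encode_utf8string("Tom's & Jerry"))           # 'Tom%27s%20%26%20Jerry'
print(decode_utf8string("'Tom%27s%20%26%20Jerry'")) # Tom's & Jerry
```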
Character encoding of track names and named fragments in container formats
Container format | Track name | Named fragment |
---|---|---|
MP4 | | |
MOV | | |
3GP | | |
MPEG-21 File Format | | |
Ogg | | |
Matroska | | |
MXF | | Dependent on metadata format |
ASF | | |
AVI | | No character encoding requirements specified for text streams |
FLV | | |
RMFF | | ? |