UA Server HTTP Communication
Proposed Solutions
There are two different ways of retrieving the media fragments.
- By using custom HTTP range units, defined along the different axes suitable for describing media fragments. This can be done in a single roundtrip.
- Via the creation of custom resources that will direct the client on how and where to get all the data needed to construct a fragment. This generally means at least 2 roundtrips.
Unfortunately, neither approach is vastly superior, so the solution might be to use both, depending on which problem a Web application is trying to solve. Another concern to deal with is the cacheability of the resource.
- Conrad proposed to use the term optional byte-range redirection instead of handshake. Yves argued that this is not really a byte-range redirection since there is no 3xx code used here. Hence we now refer to byte-range referral.
Proposal described in the FPWD
Single-step partial GET (aka 2-way handshake or 1 roundtrip)
A user requests a media fragment URI, for example using a web browser:
- User → UA (1):
http://www.w3.org/2008/WebVideo/Fragments/media/fragf2f.mp4#t=12,21
The UA chops off the fragment and turns it into an HTTP GET request with a time range header:
- UA → Proxy (2) → Origin Server (3):
GET /2008/WebVideo/Fragments/media/fragf2f.mp4 HTTP/1.1
Host: www.w3.org
Accept: video/*
Range: seconds=12-21
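The fragment-stripping step can be sketched in Python as follows. This is only an illustration under assumptions: the function name fragment_to_request is hypothetical, and only the temporal dimension (t=start,end in plain seconds) is handled.

```python
from urllib.parse import urlsplit


def fragment_to_request(uri):
    """Split a media fragment URI into a request path and request headers.

    Hypothetical helper: only "t=start,end" in plain seconds is mapped,
    using the custom 'seconds' range unit described in this section.
    """
    parts = urlsplit(uri)
    headers = {"Host": parts.netloc, "Accept": "video/*"}
    # The fragment is never sent to the server; it becomes a Range header.
    for item in parts.fragment.split("&"):
        name, _, value = item.partition("=")
        if name == "t":
            start, _, end = value.partition(",")
            headers["Range"] = f"seconds={start}-{end}"
    return parts.path, headers


path, headers = fragment_to_request(
    "http://www.w3.org/2008/WebVideo/Fragments/media/fragf2f.mp4#t=12,21"
)
print("GET", path, "HTTP/1.1")
for name, value in headers.items():
    print(f"{name}: {value}")
```

A URI without a fragment produces no Range header, so the request degrades to an ordinary GET of the whole resource.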
The server has a module for slicing multimedia resources on demand: it establishes the relationship between seconds and bytes, extracts the bytes corresponding to the requested fragment, and adds the new container headers needed to serve a playable resource. The server then replies with the closest inclusive range in a 206 HTTP response:
- Origin Server → Proxy (4) → UA (5):
HTTP/1.1 206 Partial Content
Accept-Ranges: bytes, seconds
Content-Length: 3571437
Content-Type: video/mp4
Content-Range: seconds 11.85-21.16/3600
The user agent will then have to skip 0.15s to start playing the multimedia fragment at 12s. Note that the server indicated that the full duration of the resource being served is 3600s.
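The skip computation on the client side could look roughly like the sketch below. The function names are hypothetical, and the parser assumes the Content-Range value has exactly the shape shown in the example response (seconds start-end/total).

```python
import re


def parse_seconds_content_range(value):
    """Parse 'seconds START-END/TOTAL' into three floats (assumed syntax)."""
    match = re.fullmatch(r"seconds\s+([\d.]+)-([\d.]+)/([\d.]+)", value.strip())
    if match is None:
        raise ValueError(f"unsupported Content-Range: {value!r}")
    start, end, total = (float(group) for group in match.groups())
    return start, end, total


def playback_offsets(requested_start, requested_end, served_start, served_end):
    """Return how much to skip at the start and trim at the end, in seconds."""
    skip = max(0.0, requested_start - served_start)
    trim = max(0.0, served_end - requested_end)
    return skip, trim


start, end, total = parse_seconds_content_range("seconds 11.85-21.16/3600")
skip, trim = playback_offsets(12, 21, start, end)
print(f"skip {skip:.2f}s, trim {trim:.2f}s, full duration {total:.0f}s")
```

The offsets are formatted to two decimals because the subtraction is done in binary floating point.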
Dual-step partial GET (aka 4-way handshake or 2 roundtrips)
A user requests a media fragment URI, for example using a web browser:
- User → UA (1):
http://www.w3.org/2008/WebVideo/Fragments/media/fragf2f.mp4#t=12,21
The UA chops off the fragment and turns it into an HTTP GET request with a time range header:
- UA → Proxy (2) → Origin Server (3):
GET /2008/WebVideo/Fragments/media/fragf2f.mp4 HTTP/1.1
Host: www.w3.org
Accept: video/*
Range: seconds=12-21
X-Accept-Range-Redirect: bytes
The Origin Server converts (maps) the time range to a byte range and puts into the HTTP response all header data that occurs at the beginning of the media resource, cannot be cached, and is required by the UA to receive a fully functional media resource. It also replies with an X-Accept-TimeURI header, which indicates to the client that it has processed the time request and converted it to bytes (this could similarly be extended to X-Accept-SpaceURI, X-Accept-TrackURI and X-Accept-NameURI). The message body of this answer contains the control section of fragf2f.mp4#t=12,21 (if required).
- Origin Server → Proxy (4) → UA (5):
HTTP/1.1 200 OK
Accept-Ranges: bytes
Content-Type: video/mp4
X-Accept-TimeURI: npt, smpte-25
X-Range-Redirect: bytes 1113724-2082711/4500000
Vary: X-Accept-Range-Redirect
Location: http://www.w3.org/2008/WebVideo/Fragments/media/fragf2f.mp4
The UA buffers the data it receives for hand-over to the media subsystem. It then proceeds to put the actual fragment request through:
- UA → Proxy (6) → Origin Server (7):
GET /2008/WebVideo/Fragments/media/fragf2f.mp4 HTTP/1.1
Host: www.w3.org
Range: bytes=1113724-2082711
The Origin Server puts the data together and sends it to the UA:
- Origin Server → Proxy (8) → UA (9):
HTTP/1.1 206 Partial Content
Accept-Ranges: bytes
Content-Type: video/mp4
Content-Range: bytes 1113724-2082711/4500000
The UA hands over the header and video data to the media subsystem, which then displays it to the user (9). Note that the server indicated that the full length of the resource being served is 4500000 bytes.
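The hand-off between the two roundtrips can be sketched as follows: the UA reads the X-Range-Redirect value from the first response and turns it into the Range header of the second request. The function name is hypothetical, and the sketch assumes the header value has exactly the shape shown above; the second request uses the standard bytes=first-last syntax of the HTTP Range header.

```python
import re


def redirect_to_range(x_range_redirect):
    """Turn 'bytes FIRST-LAST/TOTAL' into a Range value for the second GET.

    Returns (range_header_value, fragment_length, total_length).
    """
    match = re.fullmatch(r"bytes\s+(\d+)-(\d+)/(\d+)", x_range_redirect.strip())
    if match is None:
        raise ValueError(f"unsupported X-Range-Redirect: {x_range_redirect!r}")
    first, last, total = (int(group) for group in match.groups())
    if not first <= last < total:
        raise ValueError("byte range lies outside the resource")
    # Byte ranges are inclusive, hence the +1 when computing the length.
    return f"bytes={first}-{last}", last - first + 1, total


range_value, length, total = redirect_to_range("bytes 1113724-2082711/4500000")
print("Range:", range_value)
print(f"{length} of {total} bytes will be fetched in the second roundtrip")
```

Because the second request is an ordinary byte-range GET, it is the part of the exchange that existing web proxies can cache.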
TO DO
- Describe Silvia's proposal regarding the new headers Accept-TimeURI, Accept-SpaceURI, Accept-TrackURI and Accept-NameURI
- Detail the header and body of the HTTP request and HTTP response in all circumstances
- Propose a new mapping header in the HTTP response for the single-step partial GET, so that the resource becomes cacheable by smart media servers
- Range-Mapping: bytes 1234-2345/58588; original=11234-12345/39849384
Conrad's proposal
General approach
As we are dealing with an extension to existing HTTP behaviour, we must provide adequate fallbacks in case either the client or the server is not familiar with the new mechanisms.
Generally such an HTTP transaction will have the following form, allowing for discovery and fallback:
- A content author publishes a single URI to a media resource.
- A user agent that understands Media Fragment URI mechanisms will try to use those mechanisms to access it.
- It advertises to the server each mechanism that it is capable of through an appropriate request header.
- The server specifies that a given response uses the specified mechanism through an appropriate response header.
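A minimal sketch of the discovery-and-fallback logic on the client side is given below. It is not taken from Conrad's proposal: the function name and the three labels are hypothetical, and the header names are the ones used in the FPWD examples earlier on this page.

```python
def mechanism_used(status, headers):
    """Classify a response by the mechanism the server signalled.

    Hypothetical labels:
      'dual-step'   - the server returned an X-Range-Redirect byte-range referral
      'single-step' - the server answered 206 with a 'seconds' Content-Range
      'fallback'    - the server ignored the new mechanisms (plain response)
    """
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    if "x-range-redirect" in lowered:
        return "dual-step"
    if status == 206 and lowered.get("content-range", "").startswith("seconds"):
        return "single-step"
    return "fallback"


print(mechanism_used(206, {"Content-Range": "seconds 11.85-21.16/3600"}))
print(mechanism_used(200, {"X-Range-Redirect": "bytes 1113724-2082711/4500000"}))
print(mechanism_used(200, {"Content-Type": "video/mp4"}))
```

In the fallback case the UA has received the whole resource and must locate the fragment itself.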
Proposals
We separate the various parts of HTTP media segment handling into three orthogonal proposals.
Examples
A user agent may use some combination of these proposals in order to meet the requirements of specific use cases.
(DEPRECATED) Pro & Cons
- Pro
- The 2-way handshake needs only one roundtrip
- The 2-way handshake allows extracting a spatial region from a Motion JPEG2000 resource (see format table)
- The 2-way handshake usually achieves what we want without needing an HTTP protocol extension for any resource with an intrinsic time-to-data map, such as .mov or .mp4
- The 4-way handshake allows current web proxies to cache media fragments
- Cons
- In both cases, we create a custom Range unit, namely 'seconds'. We would need to create custom range units to convey the notions of pixels, tracks, etc. as well
- The 4-way handshake needs two roundtrips
- The 4-way handshake does not allow extracting a spatial region from a Motion JPEG2000 resource. Note though that all other media formats are characterized by a fixed non-cacheable header occurring at the beginning of the media stream and are thus compatible with the 4-way handshake approach
- The 2-way handshake requires specialized 'media' caches to cache media fragments