From Media Fragments Working Group Wiki
This page presents the known attempts to implement part or all of the Media Fragments specification.
1. NinSuna platform
A high-level overview of the NinSuna platform is visualized above. Before media resources can be accessed through the platform, they must first be ingested using the media resource ingester. Indexed media resources can then be accessed through the HTTP download servlet. The platform supports the time and track segment axes; spatial segments are not supported. A demonstration server has been set up for testing purposes. For the moment, two media resources are available on the platform for testing:
- http://ninsuna.elis.ugent.be/DownloadServlet/mfwg/fragf2f_orig.mp4 (track names: mp4_2 for video, mp4_1 for audio)
- http://ninsuna.elis.ugent.be/DownloadServlet/mfwg/fragf2f.ogv (track names: ogg_1 for video, ogg_2 for audio)
Media segments can be requested by using the query parameter and/or through the HTTP Range header. Note that no transcoding is applied to create the media segments.
Segments via query parameter
When media segments are requested using the query parameter, the URI syntax as specified by the group is fully supported. Note that, for the moment, space and named segments will be ignored by the server. Also, the clock time scheme is not interpreted. A number of working examples:
Segments via the HTTP Range header
Time segments can be requested using the HTTP Range header. For the moment, only the npt scheme is supported. A working example:
GET /DownloadServlet/mfwg/fragf2f.ogv HTTP/1.1
Host: ninsuna.elis.ugent.be
Range: time:npt=5-10

HTTP/1.x 200 OK
Accept-Ranges: bytes, time
Content-Range: time:npt 4.8-10.366666666666667/38.333333333333336
Content-Type: video/ogg
Content-Length: 694519
Note that this can easily be tested in Firefox using the Modify Headers add-on.
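A client needs to read the Content-Range header to learn which segment was actually served, since the returned boundaries (4.8 to 10.37 seconds in the example above) differ from the requested ones. The following is a minimal Python sketch of such a parser. The names TimeContentRange and parse_time_content_range are hypothetical, and the pattern only covers the header shape shown in the example above, not every form the specification may allow.

```python
import re
from typing import NamedTuple


class TimeContentRange(NamedTuple):
    scheme: str    # e.g. "npt"
    start: float   # start of the served segment, in seconds
    end: float     # end of the served segment, in seconds
    total: float   # duration of the resource the range refers to, in seconds


# Matches values such as "time:npt 4.8-10.366666666666667/38.333333333333336".
_CONTENT_RANGE = re.compile(
    r"^time:(?P<scheme>\w+)\s+(?P<start>[\d.]+)-(?P<end>[\d.]+)/(?P<total>[\d.]+)$"
)


def parse_time_content_range(value: str) -> TimeContentRange:
    """Parse a time-based Content-Range value (hypothetical helper)."""
    match = _CONTENT_RANGE.match(value.strip())
    if match is None:
        raise ValueError(f"not a time Content-Range: {value!r}")
    return TimeContentRange(
        match["scheme"],
        float(match["start"]),
        float(match["end"]),
        float(match["total"]),
    )


served = parse_time_content_range(
    "time:npt 4.8-10.366666666666667/38.333333333333336"
)
# Check that the served segment covers the requested 5-10 second range.
print(served.start <= 5.0 and served.end >= 10.0)
```

Comparing the parsed start and end with the requested values tells a player how much extra material it received on either side of the requested range.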
Combining the query parameter and the HTTP Range header
The two ways of requesting media segments can also be combined. A working example:
GET /DownloadServlet/mfwg/fragf2f.ogv?t=20,35 HTTP/1.1
Host: ninsuna.elis.ugent.be
Range: time:npt=5-10

HTTP/1.x 200 OK
Accept-Ranges: bytes, time
Content-Range: time:npt 4.800000000000001-10.366666666666667/15.166666666666664
Content-Type: video/ogg
Content-Length: 618310
As the example above shows, the media segment resulting from the query parameter has a length of 15.166666666666664 seconds. Requesting the range from 5 to 10 seconds within this media segment yields a new (sub) media segment, this time with context from the parent resource (i.e., the Content-Range header).
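The combined request can be sketched with the Python standard library as follows. The helper build_fragment_request is a hypothetical name; it only builds the request object and does not contact the demonstration server, whose availability is not assumed here.

```python
from urllib.request import Request

BASE = "http://ninsuna.elis.ugent.be/DownloadServlet/mfwg/fragf2f.ogv"


def build_fragment_request(base_url, query_t=None, range_npt=None):
    """Build (but do not send) a GET request for a temporal media segment.

    query_t and range_npt are (start, end) pairs in seconds; either may be None.
    """
    url = base_url
    if query_t is not None:
        # Segment selected through the query parameter, e.g. ?t=20,35
        url += "?t={},{}".format(*query_t)
    request = Request(url, method="GET")
    if range_npt is not None:
        # Sub-segment selected through the Range header, e.g. time:npt=5-10
        request.add_header("Range", "time:npt={}-{}".format(*range_npt))
    return request


request = build_fragment_request(BASE, query_t=(20, 35), range_npt=(5, 10))
print(request.full_url)
print(request.get_header("Range"))
```

Passing the resulting object to urllib.request.urlopen would send the request; the Range value is then interpreted relative to the segment selected by the query parameter, as described above.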
More information regarding the NinSuna platform can be found at http://ninsuna.elis.ugent.be and in the following publication:
- D. Van Deursen et al. NinSuna: a Fully Integrated Platform for Format-independent Multimedia Content Adaptation and Delivery based on Semantic Web Technologies. Multimedia Tools and Applications – Special Issue on Data Semantics for Multimedia Systems, volume 46, numbers 2-3, pages 371–398, January 2010. Available at http://www.springerlink.com/content/461380502m756877/.
2. Implementation with HTML5
Silvia has implemented a demo of temporal URI fragment addressing using the HTML5 video element.
- Description: http://blog.gingertech.net/2009/09/02/demo-of-deep-hyperlinking-into-html5-video/
- Demo: http://new.annodex.net/~silvia/itext/mediafrag.html
3. Integrated Web page with demos
Silvia created a Web page that uses the HTML5 video element and entry boxes for jumping to time offsets / retrieving time segments at http://www.annodex.net/~silvia/itext/mediafrag_multiple_servers.html . It has both the HTML5 demo in it as well as the NinSuna
Tools that will help create implementations
Grammar checking and Java parser
- Yves has implemented a Media Fragment URI parser in Java, see http://www.w3.org/2008/WebVideo/Fragments/code/grammar/
GStreamer and the python-gst library
- Guillaume is investigating what can be done with the python-gst library (see summary below)
- Silvia recommends contacting Edward Hervey and looking at the PiTiVi documentation
- Summary: (3 levels of implementation considered)
- High level - using direct GST pipeline elements
- Middle level - using GST programmatically (python-gst)
- Low level - implementing new plugins for GST
Using available GST elements
GST plugins already exist to crop a video or to seek to a specific start and end position in an audio or video resource:
- videocrop: the aspectratiocrop and videocrop (Crop) elements
- debug: navseek: Seek based on left-right arrows
The problem is that, as far as I know, these two plugins are only usable behind a decoder, i.e., on raw YUV or RGB video and PCM audio.
We want to be able to perform these operations directly on the media stream without decoding and re-encoding it. To do that, we need to place ourselves directly behind demuxer elements. Demuxers know about specific audio or video file formats and can parse the structure of the internal compressed media stream, providing information about time-to-byte offsets. There are two other things we can do: send events to the pipeline programmatically (the middle level above) or create new GST plugins that fit behind demuxers (the low level above).
Programmatically with Python
Media Fragment along the Time Axis. Depending on the plugins involved in the GST pipeline, it is possible to perform SEEK operations on the stream using the following unit formats:
'undefined' / 'GST_FORMAT_UNDEFINED', 'default' / 'GST_FORMAT_DEFAULT', 'bytes' / 'GST_FORMAT_BYTES', 'time' / 'GST_FORMAT_TIME', 'buffers' / 'GST_FORMAT_BUFFERS', 'percent' / 'GST_FORMAT_PERCENT'
In addition, different SeekType and SeekFlags values change the seeking technique, mode and accuracy. More information is available at http://gtk2-perl.sourceforge.net/doc/pod/GStreamer/Event/Seek.html. A seek is implemented through the following functions:
event = gst.event_new_seek(Rate, Units, Flags, gst.SEEK_TYPE_SET, ClipBegin, gst.SEEK_TYPE_SET, ClipEnd)
res = self.player.send_event(event)
self.player.set_state(gst.STATE_PLAYING)

gst_element_seek(pipeline, Rate, GST_FORMAT_TIME, Flags, GST_SEEK_TYPE_SET, pos, GST_SEEK_TYPE_SET, dur);
Both commands send the SEEK event to the whole pipeline, and some GST elements will be able to handle it. But we might want to be more precise and know exactly which elements can handle seek events and what their capabilities are.
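When seeking with GST_FORMAT_TIME, positions are expressed in nanoseconds, so the npt values of a temporal Media Fragment have to be converted first. The sketch below shows one way to do this in plain Python, without importing gst. The function parse_npt_fragment is a hypothetical helper and only handles npt values given in plain seconds, not the hh:mm:ss forms.

```python
GST_SECOND = 1_000_000_000  # GST_FORMAT_TIME positions are in nanoseconds


def parse_npt_fragment(fragment):
    """Convert a fragment such as 't=5,10' or 't=npt:5,10' into (begin_ns, end_ns).

    A missing begin defaults to 0; a missing end is returned as None.
    """
    name, _, value = fragment.partition("=")
    if name != "t":
        raise ValueError("not a temporal fragment: %r" % fragment)
    if value.startswith("npt:"):
        value = value[len("npt:"):]
    begin, _, end = value.partition(",")
    begin_ns = int(round(float(begin) * GST_SECOND)) if begin else 0
    end_ns = int(round(float(end) * GST_SECOND)) if end else None
    return begin_ns, end_ns


clip_begin, clip_end = parse_npt_fragment("t=5,10")
# These values would take the place of ClipBegin and ClipEnd in the seek call
# above, with the time format as the unit.
print(clip_begin, clip_end)
```

Whether a given element honours the resulting seek still depends on the open questions listed in this section.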
For example, can SEEK events be used at the level of demuxers?
source | DEMUXER | sink
         ^ SEEK event
E.g., consider the following GST chain for Ogg:
filesrc | oggdemux | filesrc | oggdemux |
The questions that must be further investigated are:
- Which GST elements can handle seek events?
- Which unit formats (time in nanoseconds, frames, bytes, percent, buffers) are supported by each GST element?
- Can all encapsulation specific demuxers handle time and bytes?
- Can SEEK events be translated higher up in the chain into BYTES on the filesrc SOURCE? Then we could still decode the media, find the actual part of the stream required, make sure a filesrc or uridecodebin in random access can point to the fragment of the media we need, and SINK that MF into a filesink.
Until now I have not been successful in implementing GST SEEK events on a variety of media types, either directly in C with gst_element_seek(..) or in Python with gst.event_new_seek(..).
Writing and Compiling new GST plugins
For video cropping, filters at the byte/stream level behind demuxers?
It is likely that, to perform crop operations on a video stream without decoding it, we will need a specific plugin to put behind the demuxer for each type of video stream. This certainly represents quite a bit of work.
A possibility to investigate: could there again be a pipeline PULL action that requests only the bits required for the cropped video, to be pulled and sunk back into a file or pipe?
MP4Box as media slicing tool for Media Fragment Web servers
MP4Box is a tool to create and edit MP4 files. An interesting feature is that MP4Box does NOT perform audio/video/image transcoding during MP4 creation and editing. In practice, MP4Box can be used to support temporal and track extraction from MP4 files. To obtain fragf2f.mp4#t=3,10&track='1' (within MP4 files, tracks are indicated by a trackID, which is an integer), the following commands need to be executed:
- temporal extraction:
>mp4box -split-chunk 3:10 fragf2f.mp4
Adjusting chunk start time to previous random access at 2.40 sec
Extracting chunk fragf2f_2_9.mp4 - duration 7.60 seconds
- track extraction (i.e., removal of the tracks that are not selected):
>mp4box -rem 2 fragf2f_2_9.mp4
Removing track ID 2
Saving fragf2f_2_9.mp4: 0.500 secs Interleaving
The resulting file (i.e., fragf2f_2_9.mp4) can then be returned to the client.
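A Media Fragment server could assemble these two command lines from the parsed fragment. The sketch below only builds the argument lists and does not run MP4Box. The function names are hypothetical, the list of track IDs present in the file is assumed to be known in advance, and the chunk file name has to be taken from the MP4Box output, since the tool adjusts the start time to the previous random access point.

```python
def split_chunk_command(source, start, end):
    """Command line for the temporal extraction step (-split-chunk start:end)."""
    return ["mp4box", "-split-chunk", "{}:{}".format(start, end), source]


def remove_track_commands(chunk_file, all_track_ids, wanted_track_id):
    """One -rem command for every track that was not selected."""
    return [
        ["mp4box", "-rem", str(track_id), chunk_file]
        for track_id in all_track_ids
        if track_id != wanted_track_id
    ]


# fragf2f.mp4#t=3,10&track='1', assuming the file contains track IDs 1 and 2
print(split_chunk_command("fragf2f.mp4", 3, 10))
print(remove_track_commands("fragf2f_2_9.mp4", [1, 2], 1))
```

Each list can be passed to subprocess.run to execute the corresponding step.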
Two final notes:
- A disadvantage of MP4Box is that it only supports MP4 files. Are similar tools available for other container formats (e.g., Ogg)?
- a Windows build can for example be found here.
Hurl for testing URL requests and responses
Raphael came across Hurl (http://hurl.it/) and thinks it could be useful for debugging our work.