ImplementationExperiment

From Media Fragments Working Group Wiki

This page lists the known attempts to implement part or all of the Media Fragments specification.

Implementations

1. NinSuna platform

The NinSuna platform implementing the Media Fragment scheme and protocol.

Current status

A high-level overview of the NinSuna platform is visualized above. Before media resources can be accessed through the platform, they first need to be ingested (i.e., by using the media resource ingester). Indexed media resources can be accessed through the HTTP download servlet. The platform supports the time and track segment axes; spatial segments are not supported. A demonstration server has been set up for testing purposes. For the moment, two media resources are present within our platform for testing:

Media segments can be requested by using the query parameter and/or through the HTTP Range header. Note that no transcoding is applied to create the media segments.

Segments via query parameter

When media segments are requested using the query parameter, the URI syntax as specified by the group is fully supported. Note that, for the moment, spatial and named segments are ignored by the server. Also, the clock time scheme is not interpreted. A number of working examples:

Segments via the HTTP Range header

Time segments can be requested using the HTTP Range header. For the moment, only the npt scheme is supported. A working example:

 GET /DownloadServlet/mfwg/fragf2f.ogv HTTP/1.1
 Host: ninsuna.elis.ugent.be
 Range: time:npt=5-10
 
 HTTP/1.x 200 OK
 Accept-Ranges: bytes, time
 Content-Range: time:npt 4.8-10.366666666666667/38.333333333333336
 Content-Type: video/ogg
 Content-Length: 694519
 

Note that, within a Firefox browser, this can easily be tested using the Modify Headers Add-on.
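
On the client side, the Content-Range value of such a response can be split into the start, the end and the total duration of the returned segment. The following is a minimal Python sketch and not part of the NinSuna platform: the function name parse_time_content_range is hypothetical, and the only layout it assumes is the "time:npt start-end/total" form seen in the response above.

```python
import re

# Layout assumed from the response shown above:
#   time:npt <start>-<end>/<total>   (all values in seconds)
_NUMBER = r"\d+(?:\.\d+)?"
_CONTENT_RANGE = re.compile(
    r"^time:npt\s+(%s)-(%s)/(%s)$" % (_NUMBER, _NUMBER, _NUMBER)
)


def parse_time_content_range(value):
    """Return (start, end, total) in seconds, or raise ValueError."""
    match = _CONTENT_RANGE.match(value.strip())
    if match is None:
        raise ValueError("unsupported Content-Range value: %r" % value)
    return tuple(float(group) for group in match.groups())


if __name__ == "__main__":
    start, end, total = parse_time_content_range(
        "time:npt 4.8-10.366666666666667/38.333333333333336"
    )
    # The returned segment is slightly wider than the requested 5-10 seconds.
    print(start, end, total)
```

A client can compare the parsed start and end with the values it requested to find out how far the server widened the segment.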

Combining the query parameter and the HTTP range header

The two ways of requesting media segments can also be combined. A working example:

 GET /DownloadServlet/mfwg/fragf2f.ogv?t=20,35 HTTP/1.1
 Host: ninsuna.elis.ugent.be
 Range: time:npt=5-10
 
 HTTP/1.x 200 OK
 Accept-Ranges: bytes, time
 Content-Range: time:npt 4.800000000000001-10.366666666666667/15.166666666666664
 Content-Type: video/ogg
 Content-Length: 618310

As the above example shows, the media segment resulting from the query parameter has a length of 15.166666666666664 seconds. Requesting the range from 5 to 10 seconds of this media segment results in a new (sub) media segment, but this time the response carries context from the parent resource (i.e., the Content-Range header, whose total length is that of the segment selected by the query parameter).
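
For scripted tests, the request text for either addressing style, or for both combined, can be generated programmatically. The sketch below is only an illustration: the helper name build_segment_request and its parameters are hypothetical, and it simply reproduces the request lines shown in the examples above.

```python
def build_segment_request(path, host, query_t=None, range_npt=None):
    """Return the raw HTTP request text for a media segment.

    query_t   -- optional (start, end) pair sent as the ?t= query parameter
    range_npt -- optional (start, end) pair sent as a Range: time:npt header
    """
    target = path
    if query_t is not None:
        target += "?t=%s,%s" % tuple(query_t)
    lines = ["GET %s HTTP/1.1" % target, "Host: %s" % host]
    if range_npt is not None:
        lines.append("Range: time:npt=%s-%s" % tuple(range_npt))
    # An empty line terminates the header section of an HTTP request.
    return "\r\n".join(lines) + "\r\n\r\n"


if __name__ == "__main__":
    print(build_segment_request(
        "/DownloadServlet/mfwg/fragf2f.ogv",
        "ninsuna.elis.ugent.be",
        query_t=(20, 35),
        range_npt=(5, 10),
    ))
```

Leaving out query_t or range_npt yields the request for a single addressing style.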

Additional information

More information regarding the NinSuna platform can be found at http://ninsuna.elis.ugent.be and in the following publication:

  • D. Van Deursen et al. NinSuna: a Fully Integrated Platform for Format-independent Multimedia Content Adaptation and Delivery based on Semantic Web Technologies. Multimedia Tools and Applications – Special Issue on Data Semantics for Multimedia Systems, volume 46, numbers 2-3, pages 371–398, January 2010. Available at http://www.springerlink.com/content/461380502m756877/.

2. Implementation with HTML5

Silvia has implemented a demo of temporal URI fragment addressing using the HTML5 video element.


3. Integrated Web page with demos

Silvia created a Web page, at http://www.annodex.net/~silvia/itext/mediafrag_multiple_servers.html, that uses the HTML5 video element and entry boxes for jumping to time offsets and retrieving time segments. It has both the HTML5 demo in it as well as the NinSuna


Tools that will help create implementations

Grammar checking and JAVA parser

GStreamer and the python-gst library

  • Guillaume is investigating what can be done with the python-gst library (see summary below)
  • Silvia recommends contacting Edward Hervey and looking at the PiTiVi documentation
  • Summary: (3 levels of implementation considered)
    1. High level - using direct GST pipeline elements
    2. Middle level - using GST programmatically (python-gst)
    3. Low level - implementing new plugins for GST

Using available GST elements

GST plugins already exist to crop a video or to seek to a specific start and end position in an audio or video resource:

  • videocrop plugin: the aspectratiocrop and videocrop (Crop) elements
  • debug plugin: the navseek element (Seek based on left-right arrows)

The problem is that, as far as I know, these two plugins are only usable behind a decoder, i.e. using raw YUV or RGB video and PCM audio.

We want to be able to do these operations directly on the media stream without decoding and re-encoding it. To do that, we need to place ourselves directly behind demuxer elements. Demuxers know about specific audio or video files and can parse the structure of the internal compressed media stream, providing information about TIME-BYTE offsets. There are two other things we can do: send events to the pipeline programmatically (level 2 above) or create new GST plugins that fit behind demuxers (level 3 above).

Programmatically with Python

Media Fragment along the Time Axis. Depending on the plugins involved in the GST pipeline, it is possible to perform SEEK operations on the stream using the following unit formats:

       'undefined' / 'GST_FORMAT_UNDEFINED'
       'default'   / 'GST_FORMAT_DEFAULT'
       'bytes'     / 'GST_FORMAT_BYTES'
       'time'      / 'GST_FORMAT_TIME'
       'buffers'   / 'GST_FORMAT_BUFFERS'
       'percent'   / 'GST_FORMAT_PERCENT'

Also, there are different SeekType and SeekFlags values to change the seeking technique, mode and accuracy. More info at http://gtk2-perl.sourceforge.net/doc/pod/GStreamer/Event/Seek.html. A seek is issued through the following function:

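       import gst  # pygst 0.10 bindings

       # Units is one of the formats listed above (e.g. gst.FORMAT_TIME);
       # ClipBegin and ClipEnd are expressed in that unit.
       event = gst.event_new_seek(Rate, Units,
                                  Flags,
                                  gst.SEEK_TYPE_SET, ClipBegin,
                                  gst.SEEK_TYPE_SET, ClipEnd)
       res = self.player.send_event(event)  # True if the event was handled
       self.player.set_state(gst.STATE_PLAYING)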

or, in C:

       gst_element_seek(pipeline,
                        Rate,
                        GST_FORMAT_TIME,
                        Flags,
                        GST_SEEK_TYPE_SET, pos,   /* start position */
                        GST_SEEK_TYPE_SET, dur);  /* stop position */


Both commands send the SEEK event to the whole pipeline, and some GST elements will be able to handle it. But we might want to be more precise and know exactly which elements can handle seeks and what their capabilities are.
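
When the time format is used, the clip boundaries of a media fragment (given in seconds) have to be converted to nanoseconds before they are passed to the seek call. The following is a minimal sketch under stated assumptions: it targets the pygst 0.10 bindings, the helper names seconds_to_gst_time and seek_to_fragment are hypothetical, and the choice of gst.SEEK_FLAG_FLUSH is only an example. Whether the seek actually succeeds still depends on the elements in the pipeline.

```python
GST_SECOND = 1000000000  # GST_FORMAT_TIME positions are in nanoseconds


def seconds_to_gst_time(seconds):
    """Convert a time in seconds (int or float) to nanoseconds."""
    return int(round(seconds * GST_SECOND))


def seek_to_fragment(element, clip_begin_s, clip_end_s):
    """Send a time-based SEEK event for [clip_begin_s, clip_end_s].

    `element` is a pipeline or element created with pygst 0.10.
    """
    # Imported lazily so the conversion helper is usable without pygst.
    import gst

    event = gst.event_new_seek(
        1.0,                  # normal playback rate
        gst.FORMAT_TIME,
        gst.SEEK_FLAG_FLUSH,  # assumption: other flags may suit better
        gst.SEEK_TYPE_SET, seconds_to_gst_time(clip_begin_s),
        gst.SEEK_TYPE_SET, seconds_to_gst_time(clip_end_s),
    )
    # send_event returns True if the event was handled.
    return element.send_event(event)
```

For the fragment t=5,10 this would be called as seek_to_fragment(player, 5, 10), where player is the pipeline from the Python snippet above.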

For example, can SEEK events be used at the level of demuxers?

 source | DEMUXER | sink
             ^
         SEEK event

E.g., consider the following GST chain for Ogg:

filesrc | oggdemux | filesrc | oggdemux |

The questions that must be further investigated are:

  • Which GST elements can handle seek events?
  • Which unit formats (time in nanoseconds, frames, bytes, percent, buffers) are supported by each GST element?
  • Can all encapsulation-specific demuxers handle time and bytes?
  • Can SEEK events be translated higher up in the chain into BYTES on the filesrc SOURCE? Then we could still decode the media, find the actual part of the stream required, make sure a filesrc or uridecodebin in random access mode can point to the fragment of the media we need, and SINK that media fragment into a filesink.

Until now I haven't been successful in implementing GST SEEK events on a variety of media types, either in Python with gst.event_new_seek(..) or directly in C with gst_element_seek(..).

Writing and Compiling new GST plugins

For Video Cropping, filters at BYTE/STREAM levels behind demuxers ?

It is likely that, to perform crop operations on a video stream without touching it, we will need specific plugins to put behind the demuxers for each type of video stream. This certainly represents quite a bit of work.

A possibility to investigate: could there again be a pipeline PULL action that requests only the bits required for the cropped video, so that they can be pulled and sunk back into a file or pipe?


MP4Box as media slicing tool for Media Fragment Web servers

MP4Box is a tool to create and edit MP4 files. An interesting feature is that MP4Box does NOT perform audio/video/image transcoding during MP4 creation and editing. In practice, MP4Box can be used to support temporal and track extraction from MP4 files. To obtain fragf2f.mp4#t=3,10&track='1' (within MP4 files, tracks are indicated by a trackID, which is an integer), the following commands need to be executed:

  • temporal extraction:
>mp4box -split-chunk 3:10 fragf2f.mp4
 Adjusting chunk start time to previous random access at 2.40 sec
 Extracting chunk fragf2f_2_9.mp4 - duration 7.60 seconds
  • track extraction (i.e., removal of the tracks that are not selected):
>mp4box -rem 2 fragf2f_2_9.mp4
 Removing track ID 2
 Saving fragf2f_2_9.mp4: 0.500 secs Interleaving

The resulting file (i.e., fragf2f_2_9.mp4) can then be returned to the client.
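
On a Media Fragment Web server, these two steps could be driven from a script. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, the executable is called mp4box as in the commands above, and only the -split-chunk and -rem options shown above are used. Because MP4Box adjusts the chunk start time to the previous random access point and names the chunk file accordingly, the chunk file name has to be supplied by the caller (e.g., taken from MP4Box's output).

```python
import subprocess


def split_chunk_command(source, start_s, end_s):
    """Command for the temporal extraction step."""
    return ["mp4box", "-split-chunk", "%s:%s" % (start_s, end_s), source]


def remove_track_commands(chunk, all_track_ids, selected_track_ids):
    """One removal command per track that was not selected."""
    selected = set(selected_track_ids)
    return [
        ["mp4box", "-rem", str(track_id), chunk]
        for track_id in all_track_ids
        if track_id not in selected
    ]


def run(command):
    """Execute a command; raises CalledProcessError on a non-zero exit."""
    subprocess.check_call(command)


if __name__ == "__main__":
    # Prints the commands for fragf2f.mp4#t=3,10&track='1' without running them.
    print(split_chunk_command("fragf2f.mp4", 3, 10))
    for command in remove_track_commands("fragf2f_2_9.mp4", [1, 2], [1]):
        print(command)
```

Building the commands as argument lists rather than shell strings avoids quoting problems when file names come from request URIs.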

Two final notes:

  • A disadvantage of MP4Box is that it only supports MP4 files. Are similar tools available for other container formats (e.g., Ogg)?
  • a Windows build can for example be found here.


Hurl for testing URL requests and responses

Raphael came across Hurl (http://hurl.it/) and thinks it could be useful for debugging our work.