Bug 18920 - addSourceBuffer parameter type should be optional
Summary: addSourceBuffer parameter type should be optional
Status: RESOLVED WONTFIX
Alias: None
Product: HTML WG
Classification: Unclassified
Component: Media Source Extensions
Version: unspecified
Hardware: All All
Importance: P2 normal
Target Milestone: ---
Assignee: Adrian Bateman [MSFT]
QA Contact: HTML WG Bugzilla archive list
Blocks: 18922
Reported: 2012-09-19 18:32 UTC by Hadar Weiss
Modified: 2013-04-15 17:18 UTC
Description Hadar Weiss 2012-09-19 18:32:38 UTC
In the current spec the initialization of the sourceBuffer requires the developer to specify the exact codec. For example:

mediaSource.addSourceBuffer('video/webm; codecs="vorbis,vp8"');
mediaSource.addSourceBuffer('video/mp4; codecs="avc1.42E01E, mp4a.40.2"');

As a developer, I think making the type parameter optional would make the API much more practical and easier to use.

Scenario:
A website has a library of various videos in different codecs (as users upload the videos). The video is fetched using XHR and appended with MSE.

The MIME type can be determined by reading the XHR response, which must carry the real MIME type and not just the generic octet stream (a requirement on the web server). Determining the correct codec at the JS level is even harder. In ISO-BMFF there are plenty of variations, and the developer needs a JS parser to detect which codec is used.

Auto detection of the initialization segment by the browser would make the API much more accessible and developer friendly.
Comment 1 Aaron Colwell (c) 2012-09-19 19:59:14 UTC
(In reply to comment #0)
Yes this was done intentionally. The media engine needs to know what the format of the bytestream is to properly parse it. The assumption here is that the application knows what type of content it is trying to append because it needs this information to properly identify where init/media segments are anyways. The application should use some sort of manifest that indicates where the segment boundaries are and include the mimetypes of the streams so that SourceBuffers can be created properly. 

You could do lots of format parsing in JavaScript, but it makes more sense to have the backend do this for you and store it in a manifest.
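
A minimal sketch of the manifest approach described above. The manifest shape (streams, mimeType, initSegmentUrl, mediaSegmentUrls) and the URLs are hypothetical and invented for illustration; only addSourceBuffer() comes from the spec under discussion.

```javascript
// Hypothetical manifest produced by the backend. The field names and URLs
// are assumptions made for this sketch, not part of any specification.
const manifest = {
  streams: [
    {
      mimeType: 'video/webm; codecs="vorbis,vp8"',
      initSegmentUrl: '/media/clip/init.webm',
      mediaSegmentUrls: ['/media/clip/seg1.webm', '/media/clip/seg2.webm'],
    },
  ],
};

// Create one SourceBuffer per stream listed in the manifest.
// `mediaSource` is expected to be an open MediaSource, or any object that
// exposes a compatible addSourceBuffer(type) method.
function createSourceBuffers(mediaSource, manifest) {
  return manifest.streams.map((stream) => ({
    stream,
    sourceBuffer: mediaSource.addSourceBuffer(stream.mimeType),
  }));
}
```

With this layout the application never parses media data to learn the type; it reads the type from the manifest and passes it straight through.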
Comment 2 Philip Jägenstedt 2012-09-20 09:08:05 UTC
(In reply to comment #1)

> Yes this was done intentionally. The media engine needs to know what the format
> of the bytestream is to properly parse it. The assumption here is that the
> application knows what type of content it is trying to append because it needs
> this information to properly identify where init/media segments are anyways.

It seems to me that one can get away with ignoring the given type completely by just inspecting the first segment that is appended.
Comment 3 Aaron Colwell (c) 2012-09-21 21:02:10 UTC
(In reply to comment #2)
With the two bytestream formats that are defined, that is probably true. For WebM you essentially know within the first 32 bytes or so. If more formats get added later this may become more tricky. I'd hate to get into a case like MP3 again where you have to parse a decent amount of data before you can be confident that you are actually dealing with an MP3 stream.
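
As a rough illustration of how little data the two defined formats need, here is a sketch of container sniffing on the first few bytes. The function name sniffContainer is hypothetical. It only guesses the container, not the codecs, and the EBML signature alone cannot distinguish WebM from other Matroska files.

```javascript
// Guess the container from the first bytes of an appended segment.
// `bytes` is a Uint8Array (or array of byte values).
// Returns 'webm', 'mp4', or null when neither signature matches.
function sniffContainer(bytes) {
  // WebM/Matroska data starts with the EBML header ID 0x1A45DFA3.
  if (bytes.length >= 4 &&
      bytes[0] === 0x1a && bytes[1] === 0x45 &&
      bytes[2] === 0xdf && bytes[3] === 0xa3) {
    return 'webm';
  }
  // ISO BMFF boxes are a 4-byte size followed by a 4-character type.
  // This sketch assumes the first box of an initialization segment is 'ftyp'.
  if (bytes.length >= 8) {
    const type = String.fromCharCode(bytes[4], bytes[5], bytes[6], bytes[7]);
    if (type === 'ftyp') return 'mp4';
  }
  return null;
}
```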

The type in addSourceBuffer() was also intended to provide a way for the UA to get a hint about what the SourceBuffer will be used for so that it can determine whether it has enough resources to support the format. Throwing an exception at addSourceBuffer() allows the MediaSource to signal that it doesn't have enough resources to support the specified format and this error is signalled in a context that doesn't trigger playback to error out.

One could argue that this same exception could be signalled from append(), but it seemed less clean to me. I was thinking of addSourceBuffer() as filling a resource reservation & canPlay() role as well as creating endpoints to append media data. I'm up for rethinking this if people really object to the type parameter. I just thought it would make bytestream validation easier if the UA was told exactly what type of bytestream the application intends to append.
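
A sketch of what this fail-fast behaviour could look like from the application side, assuming addSourceBuffer() throws when the UA cannot support the given type. The helper names tryAddSourceBuffer and addFirstSupported are hypothetical.

```javascript
// Try to reserve a SourceBuffer for `type`. Returns null instead of
// propagating the exception, so the caller can fall back to another format
// before any media data has been appended.
function tryAddSourceBuffer(mediaSource, type) {
  try {
    return mediaSource.addSourceBuffer(type);
  } catch (err) {
    return null;
  }
}

// Walk a list of candidate types (most preferred first) and keep the first
// one the UA accepts. Returns null when none is accepted.
function addFirstSupported(mediaSource, candidateTypes) {
  for (const type of candidateTypes) {
    const sourceBuffer = tryAddSourceBuffer(mediaSource, type);
    if (sourceBuffer) return { type, sourceBuffer };
  }
  return null;
}
```

Because the failure surfaces at creation time, the application can choose a different rendition without playback having entered an error state.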
Comment 4 Hadar Weiss 2012-09-21 22:45:49 UTC
(In reply to comment #3)
I think developers will want something as close as possible to the normal video tag: video.src = '<url>' or the <source> tag, which also doesn't require a format.
The browser "automagically" manages to load the video and detect what kind it is.
Maybe the same detection mechanism can be used here as well?
Comment 5 Philip Jägenstedt 2012-09-24 08:00:30 UTC
(In reply to comment #4)
Browsers don't do the same thing here, see the note in the spec close to http://www.whatwg.org/specs/web-apps/current-work/multipage/the-video-element.html#resourceSuspend which begins "This specification does not currently say whether or how to check the MIME types of the media resources, or whether or how to perform file type sniffing using the actual file data."

AFAIK, IE and Firefox currently honor Content-Type. Opera ignores it completely. I'm not entirely sure what WebKit does, I've been told "it's complicated."

I don't particularly mind that there's a type attribute for addSourceBuffer, but all it would do in Opera is the equivalent of canPlayType, after which sniffing would be used to figure out the actual content. In particular, knowing the codecs up front doesn't really help.
Comment 6 Adrian Bateman [MSFT] 2012-10-21 16:16:41 UTC
Discussed during the call. Current consensus is to leave the type parameter to allow user agents to fail fast if they choose.

http://www.w3.org/2012/09/25-html-media-minutes.html#item05
Comment 7 Bram van der Kolk 2013-04-14 12:16:11 UTC
We are operating an encrypted cloud storage site, and are using XHR + File API + FileSystem API to do in-browser encryption / decryption for file transfers. We would love to offer our users a preview feature for the various supported video formats and we very much welcome the MediaSource API. However, we do not have access to the user's file, so we cannot determine the MIME type on the server side. It would help us very much if the Media Source API were more intelligent and auto-detected the codec. Format parsing in JavaScript is indeed very cumbersome.
Comment 8 Aaron Colwell 2013-04-15 17:18:18 UTC
(In reply to comment #7)
For applications to properly work with MSE they need to have details about the layout of the files so they can properly handle seeking. MSE also only works with files that conform to the byte stream specifications outlined in the spec. For example normal non-fragmented MP4 files cannot be played w/ MSE. Only fragmented MP4 files can be used. Even if the mimetype requirement was relaxed, you still would have to know intimate details about the file for your application to work properly with MSE.
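
To illustrate the kind of file knowledge involved, here is a heuristic sketch that walks the top-level ISO BMFF boxes and reports whether a 'moof' (movie fragment) box is present. The function names are hypothetical, and the check only works when the data passed in reaches the first fragment; an initialization segment on its own contains no 'moof' box.

```javascript
// List the types of the top-level ISO BMFF boxes found in `bytes`
// (a Uint8Array). Stops at the first truncated or malformed box.
function listTopLevelBoxes(bytes) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const types = [];
  let offset = 0;
  while (offset + 8 <= bytes.length) {
    let size = view.getUint32(offset); // box sizes are big-endian
    types.push(String.fromCharCode(
      bytes[offset + 4], bytes[offset + 5], bytes[offset + 6], bytes[offset + 7]));
    if (size === 1) {
      // A size of 1 means a 64-bit "largesize" follows the type field.
      if (offset + 16 > bytes.length) break;
      size = view.getUint32(offset + 8) * 2 ** 32 + view.getUint32(offset + 12);
      if (size < 16) break; // malformed
    } else if (size === 0) {
      break; // a size of 0 means the box extends to the end of the data
    } else if (size < 8) {
      break; // malformed: smaller than its own header
    }
    offset += size;
  }
  return types;
}

// Heuristic: the data looks fragmented if a top-level 'moof' box is present.
function looksFragmented(bytes) {
  return listTopLevelBoxes(bytes).includes('moof');
}
```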

As MSE becomes more popular, I'm sure that JavaScript libraries to parse and/or transmux content will become available so it should be easier to achieve your goals. Simply making the mimetype parameter optional won't accomplish what you need though and actually makes it harder for the UA to reject content it can't support w/o interrupting playback.