Bug 18510 - decodeAudioData should accept a mime-type
Status: NEW
Product: AudioWG
Classification: Unclassified
Component: Web Audio API
Version: unspecified
Hardware: PC Windows 3.1
Importance: P2 normal
Target Milestone: TBD
Assigned To: Chris Rogers
QA Contact: public-audio
Reported: 2012-08-09 16:16 UTC by Tony Ross [MSFT]
Modified: 2012-12-04 00:51 UTC

Description Tony Ross [MSFT] 2012-08-09 16:16:41 UTC
The decodeAudioData method on AudioContext is stated to support any of the formats supported by the <audio> element, but unlike the <audio> element it doesn't allow the author to state the format of the audio data (since the ArrayBuffer is already a step removed from the XMLHttpRequest likely used to fetch the data).

We should fix this by adding an (ideally required) contentType argument to decodeAudioData to communicate the format of the audio in the provided ArrayBuffer.
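
To make the gap concrete, here is a minimal JavaScript sketch, assuming the data is fetched with an XMLHttpRequest whose responseType is "arraybuffer". The helper name captureAudioResponse is hypothetical and not part of any API. It only illustrates that the Content-Type header lives on the request object, so it has to be carried alongside the ArrayBuffer by hand if anything downstream is to see it.

```javascript
// Hypothetical helper (not a browser API): pairs the response bytes with the
// Content-Type header. `xhr` is assumed to be a completed XMLHttpRequest with
// responseType = "arraybuffer".
function captureAudioResponse(xhr) {
  const header = xhr.getResponseHeader("Content-Type");
  return {
    buffer: xhr.response,
    // Drop parameters such as "; codecs=..." and normalise case.
    contentType: header ? header.split(";")[0].trim().toLowerCase() : null,
  };
}
```

Today only the buffer field of that result can be handed to decodeAudioData; the contentType field has nowhere to go.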
Comment 1 Chris Rogers 2012-08-09 19:18:06 UTC
(In reply to comment #0)
> The decodeAudioData method on AudioContext is stated to support any of the
> formats supported by the <audio> element, but unlike the <audio> element it
> doesn't allow the author to state the format of the audio data (since the
> ArrayBuffer is already a step removed from the XMLHttpRequest likely used to
> fetch the data).
> 
> We should fix this by adding an (ideally required) contentType argument to
> decodeAudioData to communicate the format of the audio in the provided
> ArrayBuffer.

Hi Tony, I'm just trying to get a better idea of why the contentType would be needed.  Are there some particular audio formats which can't be correctly inferred through "sniffing" the audio data, which is what we're doing today?
Comment 2 Tony Ross [MSFT] 2012-08-09 21:42:15 UTC
(In reply to comment #1)
> Hi Tony, I'm just trying to get a better idea of why the contentType would be
> needed.  Are there some particular audio formats which can't be correctly
> inferred through "sniffing" the audio data, which is what we're doing today?

Sure. Though some formats are more challenging to distinguish, my bigger concern here is interoperability. Sniffing generally isn't well specified and, even if it were, it introduces a fair amount of additional complexity, which can lead to differences in behavior across implementations. This can also become more complicated as new formats are introduced over time.
Comment 3 Marat Tanalin | tanalin.com 2012-08-09 23:31:53 UTC
(In reply to comment #1)
> Hi Tony, I'm just trying to get a better idea of why the contentType would be
> needed.  Are there some particular audio formats which can't be correctly
> inferred through "sniffing" the audio data, which is what we're doing today?

Having to guess when we could know exactly is always a bad idea. Intentionally abandoning the [at least optional] ability to specify the type of a thing explicitly is a bad idea squared.

Imagine a world where there are neither MIME types nor file extensions at all. Nightmare.
Comment 4 Philip Jägenstedt 2012-08-10 08:10:06 UTC
Opera ignores Content-Type for <audio>/<video> and performs sniffing as per <http://mimesniff.spec.whatwg.org/>. We also ignore Content-Type for <track> and sniff for WebVTT using the parser spec. Some background in <http://www.w3.org/html/wg/wiki/ChangeProposals/NoVideoContentType>.

In short, interoperable sniffing is already specified and implemented in Opera, and I think that using a MIME type instead is strictly worse.
Comment 5 Marat Tanalin | tanalin.com 2012-08-10 10:16:49 UTC
(In reply to comment #4)

Sniffing is a cool and useful thing, but it should be a _fallback_ for cases where a resource _cannot_ be parsed according to its explicitly specified MIME type, or where no MIME type is specified. Sniffing should _in no way_ be the _only_ way to determine content type.
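
The ordering described here (declared type first, sniffing only as a fallback) could look roughly like the JavaScript sketch below. Everything in it is a hypothetical stand-in rather than a browser API: decoders is assumed to be a Map from MIME type to a decode function that throws on data it cannot parse, and sniff is assumed to return a MIME type or null.

```javascript
// Sketch of "declared type first, sniffing as fallback".
// `decoders`: Map of MIME type -> function(bytes) that throws on bad input.
// `sniff`: function(bytes) -> MIME type string or null.
// Both are hypothetical stand-ins, not browser APIs.
function decodeWithFallback(bytes, declaredType, decoders, sniff) {
  const declaredDecoder = declaredType ? decoders.get(declaredType) : undefined;
  if (declaredDecoder) {
    try {
      return declaredDecoder(bytes);
    } catch (err) {
      // The declared type did not match the data; fall through to sniffing.
    }
  }
  const sniffedType = sniff(bytes);
  const sniffedDecoder = sniffedType ? decoders.get(sniffedType) : undefined;
  if (sniffedDecoder) {
    return sniffedDecoder(bytes);
  }
  throw new Error("Unable to determine a decodable audio format");
}
```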
Comment 6 Philip Jägenstedt 2012-08-10 11:07:51 UTC
Well, I obviously disagree, since sniffing is very deliberately the only way that media resources are identified in Opera.

Since HTTP is not involved here, there's no other authority than this WG to decide how this should work. Unless there are real-world cases where having a type parameter is an improvement, then not having it seems the obvious way to go.
Comment 7 Anthony Bowyer-Lowe 2012-08-10 11:20:28 UTC
Format sniffing is certainly problematic in general, but this usage is for the constricted context of loading audio data.

All the commonly expected audio formats and containers have uniquely identifiable headers containing full data format descriptions. The audio-related MIME types really provide no useful information beyond hinting at the type of header a format parser should initially expect.
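
As an illustration of header-based identification, the JavaScript sketch below checks a few well-known leading signatures (RIFF/WAVE, OggS, fLaC, ID3). It is a simplified sketch, not the algorithm any browser uses: the function names and the returned type labels are illustrative assumptions, and MP3 data without an ID3v2 tag would need frame-header parsing that is left out here.

```javascript
// Illustrative signature table: every [byte offset, ASCII signature] pair in
// `parts` must match. Type labels are illustrative only.
const SIGNATURES = [
  { type: "audio/wave", parts: [[0, "RIFF"], [8, "WAVE"]] },
  { type: "application/ogg", parts: [[0, "OggS"]] },
  { type: "audio/flac", parts: [[0, "fLaC"]] },
  // MP3 data that begins with an ID3v2 tag. Bare MPEG frames are not
  // detected by this sketch.
  { type: "audio/mpeg", parts: [[0, "ID3"]] },
];

// True if `ascii` appears in `bytes` starting at `offset`.
function matchesAt(bytes, offset, ascii) {
  if (bytes.length < offset + ascii.length) return false;
  for (let i = 0; i < ascii.length; i++) {
    if (bytes[offset + i] !== ascii.charCodeAt(i)) return false;
  }
  return true;
}

// Returns a type label for the ArrayBuffer, or null if nothing matches.
function sniffAudioType(arrayBuffer) {
  const bytes = new Uint8Array(arrayBuffer);
  for (const { type, parts } of SIGNATURES) {
    if (parts.every(([offset, ascii]) => matchesAt(bytes, offset, ascii))) {
      return type;
    }
  }
  return null;
}
```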
Comment 8 Ehsan Akhgari [:ehsan] 2012-08-27 21:19:09 UTC
I would also prefer decodeAudioData to accept a non-optional mimeType argument, and I think we should raise an error if the mimeType argument is not supported by the engine.  The reason that I would advocate for this change is that the audio codecs supported by each engine (potentially on each platform) are different, and that makes sniffing even worse than it already is.
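
A rough JavaScript sketch of the behavior proposed in this comment, written as a wrapper around the existing callback-based decodeAudioData(audioData, successCallback, errorCallback). The wrapper name decodeAudioDataWithType and the supportedTypes list are assumptions made for illustration; in the actual proposal the check would live inside the engine, and the kind of error raised is not settled anywhere in this thread.

```javascript
// Hypothetical wrapper, not part of the Web Audio API: requires a MIME type
// and rejects unsupported ones before any decoding is attempted.
// `supportedTypes` is an assumed list of lowercase MIME types.
function decodeAudioDataWithType(context, buffer, mimeType, supportedTypes, onSuccess, onError) {
  if (typeof mimeType !== "string" || mimeType === "") {
    throw new TypeError("mimeType is required");
  }
  // Ignore parameters such as "; codecs=..." when checking support.
  const normalized = mimeType.split(";")[0].trim().toLowerCase();
  if (!supportedTypes.includes(normalized)) {
    throw new Error("Unsupported audio type: " + normalized);
  }
  // Delegate to the existing callback-based method.
  context.decodeAudioData(buffer, onSuccess, onError);
}
```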
Comment 9 Philip Jägenstedt 2012-08-28 07:48:37 UTC
(In reply to comment #8)
> I would also prefer decodeAudioData to accept a non-optional mimeType argument,
> and I think we should raise an error if the mimeType argument is not supported
> by the engine.  The reason that I would advocate for this change is that the
> audio codecs supported by each engine (potentially on each platform) are
> different, and that makes sniffing even worse than it already is.

Can you clarify how varying codec support between browsers and platforms makes sniffing any harder? AFAIK, none of the formats supported by any browser are hard to sniff or possible to mistake for another supported format.
Comment 10 Ehsan Akhgari [:ehsan] 2012-08-28 14:37:20 UTC
(In reply to comment #9)
> Can you clarify how varying codec support between browsers and platforms makes
> sniffing any harder? AFAIK, none of the formats supported by any browser are
> hard to sniff or possible to mistake for another supported format.

It won't make sniffing harder -- it just makes people rely on sniffing potentially multiple audio streams, which means that the browser engine needs to download the first few kilobytes of all of them until it finds one which it can decode.

The idea that I'm going after is very similar to the HTML5 source element.  The source element allows the browser engine to select the encoding that it supports *without* needing to download all of the available resources and sniff them.  If decodeAudioData takes a mimetype argument, then the browser engine can also avoid having to download the resource just to find out that it cannot decode it.
Comment 11 Philip Jägenstedt 2012-08-28 15:08:27 UTC
(In reply to comment #10)
> It won't make sniffing harder -- it just makes people rely on sniffing
> potentially multiple audio streams, which means that the browser engine needs
> to download the first few kilobytes of all of them until it finds one which it
> can decode.
> 
> The idea that I'm going after is very similar to the HTML5 source element.  The
> source element allows the browser engine to select the encoding that it
> supports *without* needing to download all of the available resources and sniff
> them.  If decodeAudioData takes a mimetype argument, then the browser engine
> can also avoid having to download the resource just to find out that it cannot
> decode it.

decodeAudioData takes an ArrayBuffer, so by the time you can call it you have already downloaded the data. I think that using HTMLMediaElement.canPlayType() to determine which format is supported and only downloading that data ought to be sufficient.
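
A minimal JavaScript sketch of that approach: choose which encoding to download by asking canPlayType first, so only one resource is fetched and then passed to decodeAudioData. The function name pickPlayableSource and the shape of the candidate objects are assumptions for illustration. The canPlayType check is passed in as a function so the sketch can also run outside a browser.

```javascript
// Picks the first candidate whose MIME type the engine reports it may be
// able to play. In a page, `canPlayType` could be supplied as:
//   const audio = document.createElement("audio");
//   pickPlayableSource(candidates, (type) => audio.canPlayType(type));
// Each candidate is assumed to look like { url: "...", type: "audio/ogg" }.
function pickPlayableSource(candidates, canPlayType) {
  for (const candidate of candidates) {
    // HTMLMediaElement.canPlayType returns "", "maybe", or "probably".
    if (canPlayType(candidate.type) !== "") {
      return candidate;
    }
  }
  return null;
}
```

Only the URL of the returned candidate would then be fetched as an ArrayBuffer.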
Comment 12 Chris Rogers 2012-10-05 21:21:12 UTC
(In reply to comment #11)
> decodeAudioData takes an ArrayBuffer, so by the time you can call it you have
> already downloaded the data. I think that using HTMLMediaElement.canPlayType()
> to determine which format is supported and only downloading that data ought to
> be sufficient.

I agree with Philip, and also believe that requiring a MIME-type is much heavier for the developer since they would have to keep track of more information than they do today.  I can see that we might want to consider an optional MIME-type, but would like to see practical use cases drive that.
Comment 13 Ehsan Akhgari [:ehsan] 2012-10-05 21:49:11 UTC
(In reply to comment #12)
> I agree with Philip, and also believe that requiring a MIME-type is much
> heavier for the developer since they would have to keep track of more
> information than they do today.  I can see that we might want to consider an
> optional MIME-type, but would like to see practical use cases drive that.

I think an optional mime-type argument is the worst possible scenario for implementers (as they need to support both cases) and authors (as it will be needlessly confusing to them as to when they should pass one and when they should not).  :-)

Honestly I have not been able to come up with an argument to answer comment 11 so far. :-)
Comment 14 Chris Rogers 2012-10-05 23:16:08 UTC
Our experience so far with this API out in the field is that developers have never had an issue where MIME-type comes into play.

Of course, we still *do* have issues about non-uniform browser support for various audio formats, but that's a whole different can of worms...