1. Definitions
- Codec
- 
     Refers generically to an instance of AudioDecoder, AudioEncoder, VideoDecoder, or VideoEncoder. 
- Key Chunk
- 
     An encoded chunk that does not depend on any other frames for decoding. Also commonly referred to as a "key frame". 
- Internal Pending Output
- 
     Codec outputs such as VideoFrames that currently reside in the internal pipeline of the underlying codec implementation. The underlying codec implementation may emit new outputs only when new inputs are provided. The underlying codec implementation must emit all outputs in response to a flush.
- Codec System Resources
- 
     Resources including CPU memory, GPU memory, and exclusive handles to specific decoding/encoding hardware that may be allocated by the User Agent as part of codec configuration or generation of AudioData and VideoFrame objects. Such resources may be quickly exhausted and should be released immediately when no longer in use.
- Temporal Layer
- 
     A grouping of EncodedVideoChunks whose timestamp cadence produces a particular framerate. See scalabilityMode.
- Progressive Image
- 
     An image that supports decoding to multiple levels of detail, with lower levels becoming available while the encoded data is not yet fully buffered. 
- Progressive Image Frame Generation
- 
     A generational identifier for a given Progressive Image decoded output. Each successive generation adds additional detail to the decoded output. The mechanism for computing a frame’s generation is implementer defined. 
- Primary Image Track
- 
     An image track that is marked by the given image file as being the default track. The mechanism for indicating a primary track is format defined. 
2. Codec Processing Model
2.1. Background
This section is non-normative.
The codec interfaces defined by the specification are designed such that new
codec tasks may be scheduled while previous tasks are still pending. For
example, web authors may call decode() without waiting for a previous decode() to complete. This is achieved by offloading underlying codec tasks to
a separate thread for parallel execution.
This section describes threading behaviors as they are visible from the perspective of web authors. Implementers may choose to use more or fewer threads as long as the externally visible behaviors of blocking and sequencing are maintained as follows.
2.2. Control Thread and Codec Thread
All steps in this specification will run on either a control thread or a codec thread.
The control thread is the thread from which authors will construct a codec and invoke its methods. Invoking a codec’s methods will typically result in the creation of control messages which are later executed on the codec thread. Each global object has a separate control thread.
The codec thread is the thread from which a codec will dequeue control messages and execute their steps. Each codec instance has a separate codec thread. The lifetime of a codec thread matches that of its associated codec instance.
The control thread uses a traditional event loop, as described in [HTML].
The codec thread uses a specialized codec processing loop.
Communication from the control thread to the codec thread is done using control message passing. Communication in the other direction is done using regular event loop tasks.
Each codec instance has a single control message queue that is a queue of control messages.
Queuing a control message means enqueuing the message to a codec’s control message queue. Invoking codec methods will often queue a control message to schedule work.
Running a control message means performing a sequence of steps specified by the method that enqueued the message. The steps of a control message may depend on injected state, supplied by the method that enqueued the message.
Resetting the control message queue means performing these steps:
- 
     For each control message in the control message queue: - 
       If a control message’s injected state includes a promise, reject that promise. 
- 
       Remove the message from the queue. 
 
- 
       
The codec processing loop must run these steps:
- 
     While true: - 
       If the control message queue is empty, continue. 
- 
       Dequeue front message from the control message queue. 
- 
       Run control message steps described by front message. 
 
- 
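As a rough illustration of the queueing rules above, the following TypeScript sketch models a control message queue outside of any browser. The names `ControlMessage`, `ControlMessageQueue`, and `drain`, and the plain `Error` used for rejection, are assumptions made for this example and are not defined by the specification. Unlike the codec processing loop, which runs indefinitely on the codec thread, `drain()` runs the pending messages once and returns.

```typescript
// Hypothetical model of a control message and its injected state.
interface ControlMessage {
  // Steps supplied by the method that enqueued the message.
  run: () => void;
  // Present only when the injected state includes a promise.
  reject?: (reason: Error) => void;
}

class ControlMessageQueue {
  private messages: ControlMessage[] = [];

  // Queuing a control message: append it to the back of the queue.
  enqueue(message: ControlMessage): void {
    this.messages.push(message);
  }

  // Resetting the queue: reject any injected promise, then remove every message.
  reset(): void {
    for (const message of this.messages) {
      // Assumption: a generic Error stands in for the rejection reason.
      message.reject?.(new Error("control message queue was reset"));
    }
    this.messages = [];
  }

  // Dequeue and run messages in FIFO order until the queue is empty.
  // Returns the number of messages that were run.
  drain(): number {
    let ran = 0;
    while (this.messages.length > 0) {
      const front = this.messages.shift()!;
      front.run();
      ran++;
    }
    return ran;
  }

  get size(): number {
    return this.messages.length;
  }
}
```

The sketch keeps the two properties the text relies on: messages run in the order they were queued, and a reset rejects pending promises before discarding the work.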
       
3. AudioDecoder Interface
[Exposed=(Window,DedicatedWorker)]
interface AudioDecoder {
  constructor(AudioDecoderInit init);

  readonly attribute CodecState state;
  readonly attribute long decodeQueueSize;

  undefined configure(AudioDecoderConfig config);
  undefined decode(EncodedAudioChunk chunk);
  Promise<undefined> flush();
  undefined reset();
  undefined close();

  static Promise<AudioDecoderSupport> isConfigSupported(AudioDecoderConfig config);
};

dictionary AudioDecoderInit {
  required AudioDataOutputCallback output;
  required WebCodecsErrorCallback error;
};

callback AudioDataOutputCallback = undefined(AudioData output);
3.1. Internal Slots
- [[codec implementation]]
- 
     Underlying decoder implementation provided by the User Agent. 
- [[output callback]]
- 
     Callback given at construction for decoded outputs. 
- [[error callback]]
- 
     Callback given at construction for decode errors. 
- [[key chunk required]]
- 
     A boolean indicating that the next chunk passed to decode() must describe a key chunk as indicated by [[type]].
- [[state]]
- 
     The current CodecState of this AudioDecoder.
- [[decodeQueueSize]]
- 
     The number of pending decode requests. This number will decrease as the underlying codec is ready to accept new input. 
3.2. Constructors
 AudioDecoder(init)  
   - 
     Let d be a new AudioDecoder object.
- 
     Assign init.output to [[output callback]].
- 
     Assign init.error to [[error callback]].
- 
     Assign true to [[key chunk required]].
- 
     Assign "unconfigured" to [[state]].
- 
     Return d. 
3.3. Attributes
-  state, of type CodecState, readonly
- Returns the value of [[state]].
-  decodeQueueSize, of type long, readonly
-  Returns the value of [[decodeQueueSize]].
3.4. Methods
- configure(config)
- 
      Enqueues a control message to configure the audio decoder for decoding
    chunks as described by config. 
     NOTE: This method will trigger a NotSupportedError if the user agent does not support config. Authors should first check support by calling isConfigSupported() with config. User agents are not required to support any particular codec type or configuration.
     When invoked, run these steps: - 
       If config is not a valid AudioDecoderConfig, throw a TypeError.
- 
       If [[state]] is "closed", throw an InvalidStateError.
- 
       Set [[state]] to "configured".
- 
       Set [[key chunk required]] to true.
- 
       Queue a control message to configure the decoder with config. 
 Running a control message to configure the decoder means running these steps: - 
       Let supported be the result of running the Check Configuration Support algorithm with config. 
- 
       If supported is true, assign [[codec implementation]] with an implementation supporting config.
- 
       Otherwise, run the Close AudioDecoder algorithm with NotSupportedError.
 
- 
       
- decode(chunk)
- 
      Enqueues a control message to decode the given chunk. 
     When invoked, run these steps: - 
       If [[state]] is not "configured", throw an InvalidStateError.
- 
       If [[key chunk required]] is true:
- 
         Implementers should inspect the chunk’s [[internal data]] to verify that it is truly a key chunk. If a mismatch is detected, throw a DataError.
- 
         Otherwise, assign false to [[key chunk required]].
 
- 
       Increment [[decodeQueueSize]].
- 
       Queue a control message to decode the chunk. 
 Running a control message to decode the chunk means performing these steps: - 
       Attempt to use [[codec implementation]] to decode the chunk.
- 
       If decoding results in an error, queue a task on the control thread event loop to run the Close AudioDecoder algorithm with EncodingError.
- 
       Queue a task on the control thread event loop to decrement [[decodeQueueSize]].
- 
       Let decoded outputs be a list of decoded audio data outputs emitted by [[codec implementation]].
- 
       If decoded outputs is not empty, queue a task on the control thread event loop to run the Output AudioData algorithm with decoded outputs. 
 
- 
       
- flush()
- 
      Completes all control messages in the control message queue and emits all outputs. 
     When invoked, run these steps: - 
       If [[state]] is not "configured", return a promise rejected with InvalidStateError DOMException.
- 
       Set [[key chunk required]] to true.
- 
       Let promise be a new Promise. 
- 
       Queue a control message to flush the codec with promise. 
- 
       Return promise. 
 Running a control message to flush the codec means performing these steps with promise. - 
       Signal [[codec implementation]] to emit all internal pending outputs.
- 
       Let decoded outputs be a list of decoded audio data outputs emitted by [[codec implementation]].
- 
       If decoded outputs is not empty, queue a task on the control thread event loop to run the Output AudioData algorithm with decoded outputs. 
- 
       Queue a task on the control thread event loop to resolve promise. 
 
- 
       
- reset()
- 
      Immediately resets all state including configuration, control messages in the control message queue, and all pending
    callbacks. 
     When invoked, run the Reset AudioDecoder algorithm. 
- close()
- 
      Immediately aborts all pending work and releases system resources.
    Close is final. 
     When invoked, run the Close AudioDecoder algorithm. 
- isConfigSupported(config)
- 
      Returns a promise indicating whether the provided config is supported by
    the user agent. 
     NOTE: The returned AudioDecoderSupport config will contain only the dictionary members that the user agent recognized. Unrecognized dictionary members will be ignored. Authors may detect unrecognized dictionary members by comparing config to their provided config.
     When invoked, run these steps: - 
       If config is not a valid AudioDecoderConfig, return a promise rejected with TypeError.
- 
       Let p be a new Promise. 
- 
       Let checkSupportQueue be the result of starting a new parallel queue. 
- 
       Enqueue the following steps to checkSupportQueue: - 
         Let decoderSupport be a newly constructed AudioDecoderSupport, initialized as follows: - 
           Set config to the result of running the Clone Configuration algorithm with config.
- 
           Set supported to the result of running the Check Configuration Support algorithm with config.
 
- 
           
- 
         Resolve p with decoderSupport. 
 
- 
         
- 
       Return p. 
 
- 
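The note on isConfigSupported() says authors may detect unrecognized dictionary members by comparing the returned config to the one they provided. A minimal sketch of that comparison follows. The helper name `unrecognizedMembers` is hypothetical, the comparison is shallow (top-level keys only), and plain objects stand in for the real dictionaries so the example runs outside a browser.

```typescript
// Returns the top-level keys that appear in `provided` but not in `returned`.
// Assumption: `returned` plays the role of the `config` member of the support
// result, and only top-level members are compared.
function unrecognizedMembers(
  provided: Record<string, unknown>,
  returned: Record<string, unknown>,
): string[] {
  return Object.keys(provided).filter((key) => !(key in returned));
}

// Example with stand-in data: `futureOption` is a made-up member that a
// user agent would not recognize and would therefore leave out.
const providedConfig = {
  codec: "opus",
  sampleRate: 48000,
  numberOfChannels: 2,
  futureOption: true,
};
const returnedConfig = {
  codec: "opus",
  sampleRate: 48000,
  numberOfChannels: 2,
};
const ignoredMembers = unrecognizedMembers(providedConfig, returnedConfig);
```

In a page, the second argument would come from the promise returned by isConfigSupported(); nested dictionaries would need a recursive comparison that this sketch does not attempt.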
       
3.5. Algorithms
- Output AudioData (with outputs)
- 
      Run these steps: 
     - 
       For each output in outputs: - 
         Let data be an AudioData, initialized as follows: - 
           Assign false to [[detached]].
- 
           Let resource be the media resource described by output. 
- 
           Let resourceReference be a reference to resource. 
- 
           Assign resourceReference to [[resource reference]].
- 
           Let timestamp be the [[timestamp]] of the EncodedAudioChunk associated with output.
- 
           Assign timestamp to [[timestamp]].
- 
           Assign values to [[format]], [[sample rate]], [[number of frames]], and [[number of channels]] as determined by output.
 
- 
           
- 
         Invoke [[output callback]] with data.
 
- 
         
 
- 
       
- Reset AudioDecoder
- 
      Run these steps: 
     - 
       If [[state]] is "closed", throw an InvalidStateError.
- 
       Set [[state]] to "unconfigured".
- 
       Signal [[codec implementation]] to cease producing output for the previous configuration.
- 
       Set [[decodeQueueSize]] to zero.
 
- 
       
- Close AudioDecoder (with error)
- 
      Run these steps: 
     - 
       Run the Reset AudioDecoder algorithm. 
- 
       Set [[state]] to "closed".
- 
       Clear [[codec implementation]] and release associated system resources.
- 
       If error is set, queue a task on the control thread event loop to invoke the [[error callback]] with error.
 
- 
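The interplay of [[state]], [[key chunk required]], and [[decodeQueueSize]] across configure(), decode(), reset, and close can be sketched as a small state model. This is an illustration only: the class name `DecoderStateModel` is hypothetical, no decoding takes place, plain `Error` objects named after the DOM exceptions stand in for the real ones, and the key chunk check looks only at a declared chunk type instead of inspecting the chunk data.

```typescript
type CodecState = "unconfigured" | "configured" | "closed";
type ChunkType = "key" | "delta";

// Hypothetical stand-in for the decoder's internal slots.
class DecoderStateModel {
  state: CodecState = "unconfigured";
  keyChunkRequired = true;
  decodeQueueSize = 0;

  configure(): void {
    if (this.state === "closed") throw new Error("InvalidStateError");
    this.state = "configured";
    // Every (re)configuration requires the next chunk to be a key chunk.
    this.keyChunkRequired = true;
  }

  decode(type: ChunkType): void {
    if (this.state !== "configured") throw new Error("InvalidStateError");
    if (this.keyChunkRequired) {
      // Simplification: trust the declared type instead of inspecting data.
      if (type !== "key") throw new Error("DataError");
      this.keyChunkRequired = false;
    }
    this.decodeQueueSize++;
  }

  reset(): void {
    if (this.state === "closed") throw new Error("InvalidStateError");
    this.state = "unconfigured";
    this.decodeQueueSize = 0;
  }

  close(): void {
    // Close runs the reset steps first, then makes the state final.
    this.reset();
    this.state = "closed";
  }
}
```

Because close reuses the reset steps, closing an already closed model throws in this sketch, mirroring the first step of the reset algorithm as written above.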
       
4. VideoDecoder Interface
[Exposed=(Window,DedicatedWorker)]
interface VideoDecoder {
  constructor(VideoDecoderInit init);

  readonly attribute CodecState state;
  readonly attribute long decodeQueueSize;

  undefined configure(VideoDecoderConfig config);
  undefined decode(EncodedVideoChunk chunk);
  Promise<undefined> flush();
  undefined reset();
  undefined close();

  static Promise<VideoDecoderSupport> isConfigSupported(VideoDecoderConfig config);
};

dictionary VideoDecoderInit {
  required VideoFrameOutputCallback output;
  required WebCodecsErrorCallback error;
};

callback VideoFrameOutputCallback = undefined(VideoFrame output);
4.1. Internal Slots
- [[codec implementation]]
- 
     Underlying decoder implementation provided by the User Agent. 
- [[output callback]]
- 
     Callback given at construction for decoded outputs. 
- [[error callback]]
- 
     Callback given at construction for decode errors. 
- [[active decoder config]]
- 
     The VideoDecoderConfig that is actively applied.
- [[key chunk required]]
- 
     A boolean indicating that the next chunk passed to decode() must describe a key chunk as indicated by type.
- [[state]]
- 
     The current CodecState of this VideoDecoder.
- [[decodeQueueSize]]
- 
     The number of pending decode requests. This number will decrease as the underlying codec is ready to accept new input. 
4.2. Constructors
 VideoDecoder(init)  
   - 
     Let d be a new VideoDecoder object. 
- 
     Assign init.output to the [[output callback]] internal slot.
- 
     Assign init.error to the [[error callback]] internal slot.
- 
     Assign true to [[key chunk required]].
- 
     Assign "unconfigured" to [[state]].
- 
     Return d. 
4.3. Attributes
-  state, of type CodecState, readonly
-  Returns the value of [[state]].
-  decodeQueueSize, of type long, readonly
-  Returns the value of [[decodeQueueSize]].
4.4. Methods
- configure(config)
- 
      Enqueues a control message to configure the video decoder for decoding
    chunks as described by config. 
     NOTE: This method will trigger a NotSupportedError if the user agent does not support config. Authors should first check support by calling isConfigSupported() with config. User agents are not required to support any particular codec type or configuration.
     When invoked, run these steps: - 
       If config is not a valid VideoDecoderConfig, throw a TypeError.
- 
       If [[state]] is "closed", throw an InvalidStateError.
- 
       Set [[state]] to "configured".
- 
       Set [[key chunk required]] to true.
- 
       Queue a control message to configure the decoder with config. 
 Running a control message to configure the decoder means running these steps: - 
       Let supported be the result of running the Check Configuration Support algorithm with config. 
- 
       If supported is true, assign [[codec implementation]] with an implementation supporting config.
- 
       Otherwise, run the Close VideoDecoder algorithm with NotSupportedError and abort these steps.
- 
       Set [[active decoder config]] to config.
 
- 
       
- decode(chunk)
- 
      Enqueues a control message to decode the given chunk. 
     NOTE: Authors should call close() on output VideoFrames immediately when frames are no longer needed. The underlying media resources are owned by the VideoDecoder and failing to release them (or waiting for garbage collection) may cause decoding to stall.
     When invoked, run these steps: - 
       If [[state]] is not "configured", throw an InvalidStateError.
- 
       If [[key chunk required]] is true:
- 
         Implementers should inspect the chunk’s [[internal data]] to verify that it is truly a key chunk. If a mismatch is detected, throw a DataError.
- 
         Otherwise, assign false to [[key chunk required]].
 
- 
       Increment [[decodeQueueSize]].
- 
       Queue a control message to decode the chunk. 
 Running a control message to decode the chunk means performing these steps: - 
       Attempt to use [[codec implementation]] to decode the chunk.
- 
       If decoding results in an error, queue a task on the control thread event loop to run the Close VideoDecoder algorithm with EncodingError.
- 
       Queue a task on the control thread event loop to decrement [[decodeQueueSize]].
- 
       Let decoded outputs be a list of decoded video data outputs emitted by [[codec implementation]].
- 
       If decoded outputs is not empty, queue a task on the control thread event loop to run the Output VideoFrames algorithm with decoded outputs. 
 
- 
       
- flush()
- 
      Completes all control messages in the control message queue and emits all outputs. 
     When invoked, run these steps: - 
       If [[state]] is not "configured", return a promise rejected with InvalidStateError DOMException.
- 
       Set [[key chunk required]] to true.
- 
       Let promise be a new Promise. 
- 
       Queue a control message to flush the codec with promise. 
- 
       Return promise. 
 Running a control message to flush the codec means performing these steps with promise. - 
       Signal [[codec implementation]] to emit all internal pending outputs.
- 
       Let decoded outputs be a list of decoded video data outputs emitted by [[codec implementation]].
- 
       If decoded outputs is not empty, queue a task on the control thread event loop to run the Output VideoFrames algorithm with decoded outputs. 
- 
       Queue a task on the control thread event loop to resolve promise. 
 
- 
       
- reset()
- 
      Immediately resets all state including configuration, control messages in the control message queue, and all pending
    callbacks. 
     When invoked, run the Reset VideoDecoder algorithm. 
- close()
- 
      Immediately aborts all pending work and releases system resources.
    Close is final. 
     When invoked, run the Close VideoDecoder algorithm. 
- isConfigSupported(config)
- 
      Returns a promise indicating whether the provided config is supported by
    the user agent. 
     NOTE: The returned VideoDecoderSupport config will contain only the dictionary members that the user agent recognized. Unrecognized dictionary members will be ignored. Authors may detect unrecognized dictionary members by comparing config to their provided config.
     When invoked, run these steps: - 
       If config is not a valid VideoDecoderConfig, return a promise rejected with TypeError.
- 
       Let p be a new Promise. 
- 
       Let checkSupportQueue be the result of starting a new parallel queue. 
- 
       Enqueue the following steps to checkSupportQueue: - 
         Let decoderSupport be a newly constructed VideoDecoderSupport, initialized as follows: - 
           Set config to the result of running the Clone Configuration algorithm with config.
- 
           Set supported to the result of running the Check Configuration Support algorithm with config.
 
- 
           
- 
         Resolve p with decoderSupport. 
 
- 
         
- 
       Return p. 
 
- 
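The note under decode() asks authors to call close() on output frames as soon as they are no longer needed. One way to make that hard to forget is a small wrapper that always releases the frame, sketched below. The names `Closable` and `withFrame` are hypothetical, and the interface only requires a close() method so the example can run with a stand-in object instead of a real VideoFrame.

```typescript
// Minimal shape needed by the helper; a VideoFrame would satisfy it
// because it has a close() method.
interface Closable {
  close(): void;
}

// Runs `use` with the frame and always closes the frame afterwards,
// even when `use` throws.
function withFrame<F extends Closable, R>(frame: F, use: (frame: F) => R): R {
  try {
    return use(frame);
  } finally {
    frame.close();
  }
}

// Stand-in frame used only for illustration.
let closeCalls = 0;
const stubFrame = {
  displayWidth: 640,
  close(): void {
    closeCalls++;
  },
};
const observedWidth = withFrame(stubFrame, (frame) => frame.displayWidth);
```

An output callback could pass each frame through such a helper so that the underlying resource is handed back before the next frame arrives. Work that must outlive the callback would need a different ownership scheme, which this sketch does not cover.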
       
4.5. Algorithms
- Output VideoFrames (with outputs)
- 
      Run these steps: 
     - 
       For each output in outputs: - 
         Let timestamp and duration be the timestamp and duration from the EncodedVideoChunk associated with output.
- 
         Let displayAspectWidth and displayAspectHeight be undefined. 
- 
         If displayAspectWidth and displayAspectHeight exist in the [[active decoder config]], assign their values to displayAspectWidth and displayAspectHeight respectively.
- 
         Let frame be the result of running the Create a VideoFrame algorithm with output, timestamp, duration, displayAspectWidth and displayAspectHeight. 
- 
         Invoke [[output callback]] with frame.
 
- 
         
 
- 
       
- Reset VideoDecoder
- 
      Run these steps: 
     - 
       If state is "closed", throw an InvalidStateError.
- 
       Set state to "unconfigured".
- 
       Signal [[codec implementation]] to cease producing output for the previous configuration.
- 
       Set [[decodeQueueSize]] to zero.
 
- 
       
- Close VideoDecoder (with error)
- 
      Run these steps: 
     - 
       Run the Reset VideoDecoder algorithm. 
- 
       Set state to "closed".
- 
       Clear [[codec implementation]] and release associated system resources.
- 
       If error is set, queue a task on the control thread event loop to invoke the [[error callback]] with error.
 
- 
       
5. AudioEncoder Interface
[Exposed=(Window,DedicatedWorker)]
interface AudioEncoder {
  constructor(AudioEncoderInit init);

  readonly attribute CodecState state;
  readonly attribute long encodeQueueSize;

  undefined configure(AudioEncoderConfig config);
  undefined encode(AudioData data);
  Promise<undefined> flush();
  undefined reset();
  undefined close();

  static Promise<AudioEncoderSupport> isConfigSupported(AudioEncoderConfig config);
};

dictionary AudioEncoderInit {
  required EncodedAudioChunkOutputCallback output;
  required WebCodecsErrorCallback error;
};

callback EncodedAudioChunkOutputCallback =
    undefined(EncodedAudioChunk output,
              optional EncodedAudioChunkMetadata metadata = {});
5.1. Internal Slots
- [[codec implementation]]
- Underlying encoder implementation provided by the User Agent.
- [[output callback]]
- Callback given at construction for encoded outputs.
- [[error callback]]
- Callback given at construction for encode errors.
- [[active encoder config]]
- The AudioEncoderConfig that is actively applied.
- [[active output config]]
-  The AudioDecoderConfig that describes how to decode the most recently emitted EncodedAudioChunk.
- [[state]]
-  The current CodecState of this AudioEncoder.
- [[encodeQueueSize]]
- The number of pending encode requests. This number will decrease as the underlying codec is ready to accept new input.
5.2. Constructors
 AudioEncoder(init)  
   - 
     Let e be a new AudioEncoder object. 
- 
     Assign init.output to the [[output callback]] internal slot.
- 
     Assign init.error to the [[error callback]] internal slot.
- 
     Assign "unconfigured" to [[state]].
- 
     Assign null to [[active encoder config]].
- 
     Assign null to [[active output config]].
- 
     Return e. 
5.3. Attributes
-  state, of type CodecState, readonly
- Returns the value of [[state]].
-  encodeQueueSize, of type long, readonly
-  Returns the value of [[encodeQueueSize]].
5.4. Methods
- configure(config)
- 
      Enqueues a control message to configure the audio encoder for
    encoding audio data as described by config. 
     NOTE: This method will trigger a NotSupportedError if the user agent does not support config. Authors should first check support by calling isConfigSupported() with config. User agents are not required to support any particular codec type or configuration.
     When invoked, run these steps: - 
       If config is not a valid AudioEncoderConfig, throw a TypeError.
- 
       If [[state]] is "closed", throw an InvalidStateError.
- 
       Set [[state]] to "configured".
- 
       Queue a control message to configure the encoder using config. 
 Running a control message to configure the encoder means performing these steps: - 
       Let supported be the result of running the Check Configuration Support algorithm with config. 
- 
       If supported is true, assign [[codec implementation]] with an implementation supporting config.
- 
       Otherwise, run the Close AudioEncoder algorithm with NotSupportedError and abort these steps.
- 
       Assign config to [[active encoder config]].
 
- 
       
- encode(data)
- 
      Enqueues a control message to encode the given data. 
     When invoked, run these steps: - 
       If the value of data’s [[detached]] internal slot is true, throw a TypeError.
- 
       If [[state]] is not "configured", throw an InvalidStateError.
- 
       Let dataClone hold the result of running the Clone AudioData algorithm with data. 
- 
       Increment [[encodeQueueSize]].
- 
       Queue a control message to encode dataClone. 
 Running a control message to encode the data means performing these steps. - 
       Attempt to use [[codec implementation]] to encode the media resource described by dataClone.
- 
       If encoding results in an error, queue a task on the control thread event loop to run the Close AudioEncoder algorithm with EncodingError.
- 
       Queue a task on the control thread event loop to decrement [[encodeQueueSize]].
- 
       Let encoded outputs be a list of encoded audio data outputs emitted by [[codec implementation]].
- 
       If encoded outputs is not empty, queue a task on the control thread event loop to run the Output EncodedAudioChunks algorithm with encoded outputs. 
 
- 
       
- flush()
- 
      Completes all control messages in the control message queue and emits all outputs. 
     When invoked, run these steps: - 
       If [[state]] is not "configured", return a promise rejected with InvalidStateError DOMException.
- 
       Let promise be a new Promise. 
- 
       Queue a control message to flush the codec with promise. 
- 
       Return promise. 
 Running a control message to flush the codec means performing these steps with promise. - 
       Signal [[codec implementation]] to emit all internal pending outputs.
- 
       Let encoded outputs be a list of encoded audio data outputs emitted by [[codec implementation]].
- 
       If encoded outputs is not empty, queue a task on the control thread event loop to run the Output EncodedAudioChunks algorithm with encoded outputs. 
- 
       Queue a task on the control thread event loop to resolve promise. 
 
- 
       
- reset()
- 
      Immediately resets all state including configuration, control messages in the control message queue, and all pending
    callbacks. 
     When invoked, run the Reset AudioEncoder algorithm. 
- close()
- 
      Immediately aborts all pending work and releases system resources.
    Close is final. 
     When invoked, run the Close AudioEncoder algorithm. 
- isConfigSupported(config)
- 
      Returns a promise indicating whether the provided config is supported by
    the user agent. 
     NOTE: The returned AudioEncoderSupport config will contain only the dictionary members that the user agent recognized. Unrecognized dictionary members will be ignored. Authors may detect unrecognized dictionary members by comparing config to their provided config.
     When invoked, run these steps: - 
       If config is not a valid AudioEncoderConfig, return a promise rejected with TypeError.
- 
       Let p be a new Promise. 
- 
       Let checkSupportQueue be the result of starting a new parallel queue. 
- 
       Enqueue the following steps to checkSupportQueue: - 
         Let encoderSupport be a newly constructed AudioEncoderSupport, initialized as follows: - 
           Set config to the result of running the Clone Configuration algorithm with config.
- 
           Set supported to the result of running the Check Configuration Support algorithm with config.
 
- 
           
- 
         Resolve p with encoderSupport. 
 
- 
         
- 
       Return p. 
 
- 
       
5.5. Algorithms
- Output EncodedAudioChunks (with outputs)
- 
      Run these steps: 
     - 
       For each output in outputs: - 
         Let chunkInit be an EncodedAudioChunkInit with the following keys: - 
           Let data contain the encoded audio data from output.
- 
           Let type be the EncodedAudioChunkType of output.
- 
           Let timestamp be the timestamp from the AudioData associated with output.
 
- 
           
- 
         Let chunk be a new EncodedAudioChunk constructed with chunkInit.
- 
         Let chunkMetadata be a new EncodedAudioChunkMetadata.
- 
         Let encoderConfig be the [[active encoder config]].
- 
         Let outputConfig be a new AudioDecoderConfig that describes output. Initialize outputConfig as follows:
- 
           Assign encoderConfig.sampleRate to outputConfig.sampleRate.
- 
           Assign encoderConfig.numberOfChannels to outputConfig.numberOfChannels.
- 
           Assign outputConfig.description with a sequence of codec specific bytes as determined by the [[codec implementation]]. The user agent must ensure that the provided description could be used to correctly decode output.
           NOTE: The codec specific requirements for populating the description are described in the [WEBCODECS-CODEC-REGISTRY].
 
- 
         If outputConfig and [[active output config]] are not equal dictionaries: - 
           Assign outputConfig to chunkMetadata.decoderConfig.
- 
           Assign outputConfig to [[active output config]].
 
- 
           
- 
         Invoke [[output callback]] with chunk and chunkMetadata.
 
- 
         
 
- 
       
- Reset AudioEncoder
- 
      Run these steps: 
     - 
       If [[state]] is "closed", throw an InvalidStateError.
- 
       Set [[state]] to "unconfigured".
- 
       Set [[active encoder config]] to null.
- 
       Set [[active output config]] to null.
- 
       Signal [[codec implementation]] to cease producing output for the previous configuration.
- 
       Set [[encodeQueueSize]] to zero.
 
- 
       
- Close AudioEncoder (with error)
- 
      Run these steps: 
     - 
       Run the Reset AudioEncoder algorithm. 
- 
       Set [[state]] to "closed".
- 
       Clear [[codec implementation]] and release associated system resources.
- 
       If error is set, queue a task on the control thread event loop to invoke the [[error callback]] with error.
 
- 
       
5.6. EncodedAudioChunkMetadata
The following metadata dictionary is emitted by the EncodedAudioChunkOutputCallback alongside an associated EncodedAudioChunk. 
dictionary EncodedAudioChunkMetadata {
  AudioDecoderConfig decoderConfig;
};
- decoderConfig, of type AudioDecoderConfig
- 
     An AudioDecoderConfig that authors may use to decode the associated EncodedAudioChunk.
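In the Output EncodedAudioChunks steps, decoderConfig is attached to the metadata only when the output configuration differs from [[active output config]]. A minimal sketch of that rule follows. The names `OutputConfig`, `equalConfigs`, and `MetadataTracker` are hypothetical, the description bytes are left out of the model, and equality is a simple member-by-member comparison.

```typescript
// Simplified stand-in for the decoder configuration that describes an output.
interface OutputConfig {
  sampleRate: number;
  numberOfChannels: number;
}

// Member-by-member equality; a null `a` never equals `b`.
function equalConfigs(a: OutputConfig | null, b: OutputConfig): boolean {
  return (
    a !== null &&
    a.sampleRate === b.sampleRate &&
    a.numberOfChannels === b.numberOfChannels
  );
}

class MetadataTracker {
  // Plays the role of [[active output config]]; starts as null.
  private activeOutputConfig: OutputConfig | null = null;

  // Returns the metadata for one output. decoderConfig is present only
  // when the configuration changed since the previous output.
  next(outputConfig: OutputConfig): { decoderConfig?: OutputConfig } {
    if (equalConfigs(this.activeOutputConfig, outputConfig)) {
      return {};
    }
    this.activeOutputConfig = outputConfig;
    return { decoderConfig: outputConfig };
  }

  // Mirrors the reset step that sets [[active output config]] to null.
  reset(): void {
    this.activeOutputConfig = null;
  }
}
```

Under this model, the first chunk after construction or after a reset always carries decoderConfig, and later chunks carry it again only if the configuration changes.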
6. VideoEncoder Interface
[Exposed=(Window,DedicatedWorker)]
interface VideoEncoder {
  constructor(VideoEncoderInit init);

  readonly attribute CodecState state;
  readonly attribute long encodeQueueSize;

  undefined configure(VideoEncoderConfig config);
  undefined encode(VideoFrame frame, optional VideoEncoderEncodeOptions options = {});
  Promise<undefined> flush();
  undefined reset();
  undefined close();

  static Promise<boolean> isConfigSupported(VideoEncoderConfig config);
};

dictionary VideoEncoderInit {
  required EncodedVideoChunkOutputCallback output;
  required WebCodecsErrorCallback error;
};

callback EncodedVideoChunkOutputCallback =
    undefined(EncodedVideoChunk chunk,
              optional EncodedVideoChunkMetadata metadata = {});
6.1. Internal Slots
- [[codec implementation]]
- Underlying encoder implementation provided by the User Agent.
- [[output callback]]
- Callback given at construction for encoded outputs.
- [[error callback]]
- Callback given at construction for encode errors.
- [[active encoder config]]
- The VideoEncoderConfig that is actively applied.
- [[active output config]]
-  The VideoDecoderConfig that describes how to decode the most recently emitted EncodedVideoChunk.
- [[state]]
-  The current CodecState of this VideoEncoder.
- [[encodeQueueSize]]
- The number of pending encode requests. This number will decrease as the underlying codec is ready to accept new input.
6.2. Constructors
 VideoEncoder(init)  
   - 
     Let e be a new VideoEncoder object. 
- 
     Assign init.output to the [[output callback]] internal slot.
- 
     Assign init.error to the [[error callback]] internal slot.
- 
     Assign "unconfigured" to [[state]].
- 
     Return e. 
6.3. Attributes
-  state, of type CodecState, readonly
- Returns the value of [[state]].
-  encodeQueueSize, of type long, readonly
-  Returns the value of [[encodeQueueSize]].
6.4. Methods
- configure(config)
- 
      Enqueues a control message to configure the video encoder for
    encoding video frames as described by config. 
     NOTE: This method will trigger a NotSupportedError if the user agent does not support config. Authors should first check support by calling isConfigSupported() with config. User agents are not required to support any particular codec type or configuration.
     When invoked, run these steps: - 
       If config is not a valid VideoEncoderConfig, throw a TypeError.
- 
       If [[state]] is "closed", throw an InvalidStateError.
- 
       Set [[state]] to "configured".
- 
       Queue a control message to configure the encoder using config. 
 Running a control message to configure the encoder means performing these steps:
- 
       Let supported be the result of running the Check Configuration Support algorithm with config. 
- 
       If supported is true, assign [[codec implementation]] with an implementation supporting config.
- 
       Otherwise, run the Close VideoEncoder algorithm with NotSupportedError and abort these steps.
- 
       Assign config to [[active encoder config]].
- encode(frame, options)
- 
      Enqueues a control message to encode the given frame. 
     When invoked, run these steps:
- 
       If the value of frame’s [[detached]] internal slot is true, throw a TypeError.
- 
       If [[state]] is not "configured", throw an InvalidStateError.
- 
       Let frameClone hold the result of running the Clone VideoFrame algorithm with frame. 
- 
       Increment [[encodeQueueSize]].
- 
       Queue a control message to encode frameClone. 
 Running a control message to encode the frame means performing these steps:
- 
       Attempt to use [[codec implementation]] to encode frameClone according to options.
- 
       If encoding results in an error, queue a task on the control thread event loop to run the Close VideoEncoder algorithm with EncodingError.
- 
       Queue a task on the control thread event loop to decrement [[encodeQueueSize]].
- 
       Let encoded outputs be a list of encoded video data outputs emitted by [[codec implementation]].
- 
       If encoded outputs is not empty, queue a task on the control thread event loop to run the Output EncodedVideoChunks algorithm with encoded outputs. 
- flush()
- 
      Completes all control messages in the control message queue and emits all outputs. 
     When invoked, run these steps:
- 
       If [[state]] is not "configured", return a promise rejected with an InvalidStateError DOMException.
- 
       Let promise be a new Promise. 
- 
       Queue a control message to flush the codec with promise. 
- 
       Return promise. 
 Running a control message to flush the codec means performing these steps with promise:
- 
       Signal [[codec implementation]] to emit all internal pending outputs.
- 
       Let encoded outputs be a list of encoded video data outputs emitted by [[codec implementation]].
- 
       If encoded outputs is not empty, queue a task on the control thread event loop to run the Output EncodedVideoChunks algorithm with encoded outputs. 
- 
       Queue a task on the control thread event loop to resolve promise. 
- reset()
- 
      Immediately resets all state including configuration, control messages in the control message queue, and all pending
    callbacks. 
     When invoked, run the Reset VideoEncoder algorithm. 
- close()
- 
      Immediately aborts all pending work and releases system resources.
    Close is final. 
     When invoked, run the Close VideoEncoder algorithm. 
- isConfigSupported(config)
- 
      Returns a promise indicating whether the provided config is supported by
    the user agent. 
     NOTE: The returned VideoEncoderSupport config will contain only the dictionary members that the user agent recognizes. Unrecognized dictionary members will be ignored. Authors may detect unrecognized dictionary members by comparing config to their provided config.
     When invoked, run these steps:
- 
       If config is not a valid VideoEncoderConfig, return a promise rejected with TypeError.
- 
       Let p be a new Promise. 
- 
       Let checkSupportQueue be the result of starting a new parallel queue. 
- 
       Enqueue the following steps to checkSupportQueue:
- 
         Let encoderSupport be a newly constructed VideoEncoderSupport, initialized as follows:
- 
           Set config to the result of running the Clone Configuration algorithm with config.
- 
           Set supported to the result of running the Check Configuration Support algorithm with config.
- 
         Resolve p with encoderSupport. 
- 
       Return p. 
6.5. Algorithms
- Output EncodedVideoChunks (with outputs)
- 
      Run these steps: 
     - 
       For each output in outputs:
- 
         Let chunkInit be an EncodedVideoChunkInit with the following keys:
- 
           Let data contain the encoded video data from output.
- 
           Let type be the EncodedVideoChunkType of output.
- 
           Let timestamp be the [[timestamp]] from the VideoFrame associated with output.
- 
           Let duration be the [[duration]] from the VideoFrame associated with output.
- 
         Let chunk be a new EncodedVideoChunk constructed with chunkInit.
- 
         Let chunkMetadata be a new EncodedVideoChunkMetadata.
- 
         Let encoderConfig be the [[active encoder config]].
- 
         Let outputConfig be a VideoDecoderConfig that describes output. Initialize outputConfig as follows:
- 
           Assign encoderConfig.codec to outputConfig.codec.
- 
           Assign encoderConfig.width to outputConfig.cropWidth.
- 
           Assign encoderConfig.height to outputConfig.cropHeight.
- 
           Assign encoderConfig.displayWidth to outputConfig.displayWidth.
- 
           Assign encoderConfig.displayHeight to outputConfig.displayHeight.
- 
           Assign the remaining keys of outputConfig as determined by [[codec implementation]]. The user agent must ensure that the configuration is completely described such that outputConfig could be used to correctly decode output.
           NOTE: The codec-specific requirements for populating the description are described in the [WEBCODECS-CODEC-REGISTRY].
- 
         If outputConfig and [[active output config]] are not equal dictionaries:
- 
           Assign outputConfig to chunkMetadata.decoderConfig.
- 
           Assign outputConfig to [[active output config]].
- 
         If encoderConfig.scalabilityMode describes multiple temporal layers:
- 
           Let temporal_layer_id be the zero-based index describing the temporal layer for output. 
- 
           Assign temporal_layer_id to chunkMetadata.temporalLayerId.
- 
         Invoke [[output callback]] with chunk and chunkMetadata.
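This example is non-normative. The decoderConfig handling in the steps above can be sketched as follows: a configuration is attached to the chunk metadata only when it differs from the previously emitted one. The names `Dict`, `equalDictionaries` and `OutputConfigTracker` are hypothetical and exist only for illustration; a user agent is free to implement this differently.

```typescript
// Hypothetical, simplified stand-in for a configuration dictionary.
type Dict = { [key: string]: string | number | boolean | Dict };

// Recursive comparison: same keys and same values, applied to nested
// dictionaries as well.
function equalDictionaries(a: Dict | null, b: Dict | null): boolean {
  if (a === null || b === null) return a === b;
  const keysA = Object.keys(a);
  const keysB = Object.keys(b);
  if (keysA.length !== keysB.length) return false;
  for (const key of keysA) {
    if (!Object.prototype.hasOwnProperty.call(b, key)) return false;
    const va = a[key];
    const vb = b[key];
    if (typeof va === "object" && typeof vb === "object") {
      if (!equalDictionaries(va, vb)) return false;
    } else if (va !== vb) {
      return false;
    }
  }
  return true;
}

// Models the [[active output config]] slot. The returned metadata carries a
// decoderConfig only when the output configuration has changed.
class OutputConfigTracker {
  private activeOutputConfig: Dict | null = null;

  metadataFor(outputConfig: Dict): { decoderConfig?: Dict } {
    if (equalDictionaries(outputConfig, this.activeOutputConfig)) {
      return {};
    }
    this.activeOutputConfig = outputConfig;
    return { decoderConfig: outputConfig };
  }
}
```

Under this sketch the first chunk always carries a decoderConfig, because the tracked configuration starts out as null.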
 
- Reset VideoEncoder
- 
      Run these steps: 
     - 
       If [[state]] is "closed", throw an InvalidStateError.
- 
       Set [[state]] to "unconfigured".
- 
       Set [[active encoder config]] to null.
- 
       Set [[active output config]] to null.
- 
       Signal [[codec implementation]] to cease producing output for the previous configuration.
- 
       Set [[encodeQueueSize]] to zero.
- Close VideoEncoder (with error)
- 
      Run these steps: 
     - 
       Run the Reset VideoEncoder algorithm. 
- 
       Set [[state]] to "closed".
- 
       Clear [[codec implementation]] and release associated system resources.
- 
       If error is set, queue a task on the control thread event loop to invoke the [[error callback]] with error.
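This example is non-normative. The state transitions described by configure(), encode(), Reset VideoEncoder and Close VideoEncoder can be sketched with a toy model that tracks only [[state]] and [[encodeQueueSize]]. The class `EncoderStateModel` and its `lastError` field are hypothetical; the model performs no encoding, uses no control message queue, and reports errors as plain `Error` objects rather than DOMExceptions.

```typescript
type CodecState = "unconfigured" | "configured" | "closed";

// Toy model of the VideoEncoder state machine (illustration only).
class EncoderStateModel {
  state: CodecState = "unconfigured";
  encodeQueueSize = 0;
  lastError: string | null = null;

  configure(): void {
    if (this.state === "closed") throw new Error("InvalidStateError");
    this.state = "configured";
  }

  encode(): void {
    if (this.state !== "configured") throw new Error("InvalidStateError");
    this.encodeQueueSize += 1;
  }

  // Mirrors Reset VideoEncoder: rejected once closed, otherwise returns to
  // "unconfigured" and drops all pending encode requests.
  reset(): void {
    if (this.state === "closed") throw new Error("InvalidStateError");
    this.state = "unconfigured";
    this.encodeQueueSize = 0;
  }

  // Mirrors Close VideoEncoder: runs the reset steps first, then makes the
  // "closed" state final and records the error, if one was given.
  close(error?: string): void {
    this.reset();
    this.state = "closed";
    if (error !== undefined) this.lastError = error;
  }
}
```

Because the close steps begin by running the reset steps, a second close in this model fails with InvalidStateError, which follows the algorithms as written.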
 
6.6. EncodedVideoChunkMetadata
The following metadata dictionary is emitted by the EncodedVideoChunkOutputCallback alongside an associated EncodedVideoChunk. 
dictionary EncodedVideoChunkMetadata {
  VideoDecoderConfig decoderConfig;
  unsigned long temporalLayerId;
};
- decoderConfig, of type VideoDecoderConfig
- 
     A VideoDecoderConfig that authors may use to decode the associated EncodedVideoChunk.
- temporalLayerId, of type unsigned long
- 
     A number that identifies the temporal layer for the associated EncodedVideoChunk.
7. Configurations
7.1. Check Configuration Support (with config)
Run these steps:
- 
     If the user agent can provide a codec to support all entries of the config, including applicable default values for keys that are not included, return true.
     NOTE: The types AudioDecoderConfig, VideoDecoderConfig, AudioEncoderConfig, and VideoEncoderConfig each define their respective configuration entries and defaults.
     NOTE: Support for a given configuration may change dynamically if the hardware is altered (e.g. external GPU unplugged) or if required hardware resources are exhausted. User agents should describe support on a best-effort basis given the resources that are available at the time of the query. 
- 
     Otherwise, return false. 
7.2. Clone Configuration (with config)
NOTE: This algorithm will copy only the dictionary members that the user agent recognizes as part of the dictionary type.
Run these steps:
- 
     Let dictType be the type of dictionary config. 
- 
     Let clone be a new empty instance of dictType. 
- 
     For each dictionary member m defined on dictType: 
- 
       If config[m] is a nested dictionary, set clone[m] to the result of recursively running the Clone Configuration algorithm with config[m].
- 
       Otherwise, assign the value of config[m] to clone[m].
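This example is non-normative. The Clone Configuration steps can be sketched as a recursive copy driven by the set of members that the user agent recognizes. The `Schema` type is a hypothetical stand-in for "the dictionary members defined on dictType", and skipping members that are absent from config is an assumption of this sketch, not something the steps above state.

```typescript
// Hypothetical description of a dictionary type: each recognized member is
// either a plain value or a nested dictionary with its own schema.
type Schema = { [member: string]: "value" | Schema };
type ConfigDict = { [key: string]: unknown };

function cloneConfiguration(config: ConfigDict, schema: Schema): ConfigDict {
  const clone: ConfigDict = {};
  for (const member of Object.keys(schema)) {
    // Assumption: members missing from config stay missing in the clone.
    if (!Object.prototype.hasOwnProperty.call(config, member)) continue;
    const kind = schema[member];
    const value = config[member];
    if (kind !== "value" && typeof value === "object" && value !== null) {
      // Nested dictionary: clone recursively with the nested schema.
      clone[member] = cloneConfiguration(value as ConfigDict, kind);
    } else {
      clone[member] = value;
    }
  }
  return clone;
}
```

Members that do not appear in the schema are never copied, which is how an author comparing the returned config with the provided one could detect unrecognized members.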
 
7.3. Signalling Configuration Support
7.3.1. AudioDecoderSupport
dictionary AudioDecoderSupport {
  boolean supported;
  AudioDecoderConfig config;
};
- supported, of type boolean
- A boolean indicating whether the corresponding config is supported by the user agent.
- config, of type AudioDecoderConfig
- An AudioDecoderConfig used by the user agent in determining the value of supported.
7.3.2. VideoDecoderSupport
dictionary VideoDecoderSupport {
  boolean supported;
  VideoDecoderConfig config;
};
- supported, of type boolean
- A boolean indicating whether the corresponding config is supported by the user agent.
- config, of type VideoDecoderConfig
- A VideoDecoderConfig used by the user agent in determining the value of supported.
7.3.3. AudioEncoderSupport
dictionary AudioEncoderSupport {
  boolean supported;
  AudioEncoderConfig config;
};
- supported, of type boolean
- A boolean indicating whether the corresponding config is supported by the user agent.
- config, of type AudioEncoderConfig
- An AudioEncoderConfig used by the user agent in determining the value of supported.
7.3.4. VideoEncoderSupport
dictionary VideoEncoderSupport {
  boolean supported;
  VideoEncoderConfig config;
};
- supported, of type boolean
- A boolean indicating whether the corresponding config is supported by the user agent.
- config, of type VideoEncoderConfig
- A VideoEncoderConfig used by the user agent in determining the value of supported.
7.4. Codec String
A codec string describes a given codec format to be used for encoding or decoding.
A valid codec string must meet the following conditions.
- 
     It is valid per the relevant codec specification (see examples below). 
- 
     It describes a single codec. 
- 
     It is unambiguous about codec profile and level for codecs that define these concepts. 
NOTE: In other media specifications, codec strings historically accompanied a MIME type as the "codecs=" parameter
    (isTypeSupported(), canPlayType()) [RFC6381]. In this specification, encoded media is not containerized;
    hence, only the value of the codecs parameter is accepted.
The format and semantics for codec strings are defined by codec registrations listed in the [WEBCODECS-CODEC-REGISTRY]. A compliant implementation may support any combination of codec registrations or none at all.
7.5. AudioDecoderConfig
dictionary AudioDecoderConfig {
  required DOMString codec;
  [EnforceRange] required unsigned long sampleRate;
  [EnforceRange] required unsigned long numberOfChannels;
  BufferSource description;
};
To check if an AudioDecoderConfig is a valid AudioDecoderConfig,
    run these steps:
- 
     If codec is not a valid codec string, return false.
- 
     Return true.
- codec, of type DOMString
- Contains a codec string describing the codec.
- sampleRate, of type unsigned long
- The number of frame samples per second.
- numberOfChannels, of type unsigned long
- The number of audio channels.
- description, of type BufferSource
- 
      A sequence of codec specific bytes, commonly known as extradata. 
     NOTE: The registrations in the [WEBCODECS-CODEC-REGISTRY] describe whether/how to populate this sequence, corresponding to the provided codec.
7.6. VideoDecoderConfig
dictionary VideoDecoderConfig {
  required DOMString codec;
  BufferSource description;
  [EnforceRange] unsigned long codedWidth;
  [EnforceRange] unsigned long codedHeight;
  [EnforceRange] unsigned long displayAspectWidth;
  [EnforceRange] unsigned long displayAspectHeight;
  HardwareAcceleration hardwareAcceleration = "allow";
};
To check if a VideoDecoderConfig is a valid VideoDecoderConfig,
run these steps:
- 
     If codec is not a valid codec string, return false.
- 
     If one of codedWidth or codedHeight is provided but the other isn’t, return false.
- 
     If codedWidth = 0 or codedHeight = 0, return false.
- 
     If one of displayAspectWidth or displayAspectHeight is provided but the other isn’t, return false.
- 
     If displayAspectWidth = 0 or displayAspectHeight = 0, return false.
- 
     Return true.
- codec, of type DOMString
- Contains a codec string describing the codec.
- description, of type BufferSource
- 
      A sequence of codec specific bytes, commonly known as extradata. 
     NOTE: The registrations in the [WEBCODECS-CODEC-REGISTRY] may describe whether/how to populate this sequence, corresponding to the provided codec.
- codedWidth, of type unsigned long
- Width of the VideoFrame in pixels, prior to any cropping or aspect ratio adjustments.
- codedHeight, of type unsigned long
-  Height of the VideoFrame in pixels, prior to any cropping or aspect ratio
        adjustments. 
    NOTE: codedWidth and codedHeight are used when selecting a [[codec implementation]].
- displayAspectWidth, of type unsigned long
- Horizontal dimension of the VideoFrame’s aspect ratio when displayed.
- displayAspectHeight, of type unsigned long
-  Vertical dimension of the VideoFrame’s aspect ratio when displayed. 
    NOTE: displayWidth and displayHeight can both be different from displayAspectWidth and displayAspectHeight, but they should have identical ratios, after scaling is applied when creating the video frame.
- hardwareAcceleration, of type HardwareAcceleration, defaulting to "allow"
-  Configures hardware acceleration for this codec. See HardwareAcceleration.
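This example is non-normative. The validity steps for a VideoDecoderConfig can be sketched as a plain function. The interface `VideoDecoderConfigLike` is a hypothetical subset of the dictionary, and the codec string check is passed in as a predicate because its rules are defined by the codec registrations rather than by this section.

```typescript
// Hypothetical subset of VideoDecoderConfig used for illustration.
interface VideoDecoderConfigLike {
  codec: string;
  codedWidth?: number;
  codedHeight?: number;
  displayAspectWidth?: number;
  displayAspectHeight?: number;
}

function isValidVideoDecoderConfig(
  config: VideoDecoderConfigLike,
  isValidCodecString: (codec: string) => boolean
): boolean {
  if (!isValidCodecString(config.codec)) return false;

  // The coded dimensions must be provided together and must be non-zero.
  const hasCodedWidth = config.codedWidth !== undefined;
  const hasCodedHeight = config.codedHeight !== undefined;
  if (hasCodedWidth !== hasCodedHeight) return false;
  if (config.codedWidth === 0 || config.codedHeight === 0) return false;

  // The same rules apply to the display aspect dimensions.
  const hasAspectWidth = config.displayAspectWidth !== undefined;
  const hasAspectHeight = config.displayAspectHeight !== undefined;
  if (hasAspectWidth !== hasAspectHeight) return false;
  if (config.displayAspectWidth === 0 || config.displayAspectHeight === 0) return false;

  return true;
}
```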
7.7. AudioEncoderConfig
dictionary AudioEncoderConfig {
  required DOMString codec;
  [EnforceRange] unsigned long sampleRate;
  [EnforceRange] unsigned long numberOfChannels;
  [EnforceRange] unsigned long long bitrate;
};
NOTE: Codec-specific extensions to AudioEncoderConfig may be defined by the
    registrations in the [WEBCODECS-CODEC-REGISTRY].
To check if an AudioEncoderConfig is a valid AudioEncoderConfig,
run these steps:
- 
     If codec is not a valid codec string, return false.
- 
     Return true.
- codec, of type DOMString
- Contains a codec string describing the codec.
- sampleRate, of type unsigned long
- The number of frame samples per second.
- numberOfChannels, of type unsigned long
- The number of audio channels.
- bitrate, of type unsigned long long
- The average bitrate of the encoded audio given in units of bits per second.
7.8. VideoEncoderConfig
dictionary VideoEncoderConfig {
  required DOMString codec;
  [EnforceRange] unsigned long long bitrate;
  [EnforceRange] required unsigned long width;
  [EnforceRange] required unsigned long height;
  [EnforceRange] unsigned long displayWidth;
  [EnforceRange] unsigned long displayHeight;
  HardwareAcceleration hardwareAcceleration = "allow";
  DOMString scalabilityMode;
};
NOTE: Codec-specific extensions to VideoEncoderConfig may be defined by the
    registrations in the [WEBCODECS-CODEC-REGISTRY].
To check if a VideoEncoderConfig is a valid VideoEncoderConfig,
    run these steps:
- 
     If codec is not a valid codec string, return false.
- 
     If displayWidth = 0 or displayHeight = 0, return false.
- 
     Return true.
- codec, of type DOMString
- Contains a codec string describing the codec.
- bitrate, of type unsigned long long
- The average bitrate of the encoded video given in units of bits per second.
- width, of type unsigned long
- 
      The encoded width of output EncodedVideoChunks in pixels, prior to any display aspect ratio adjustments. The encoder must scale any VideoFrame whose [[crop width]] differs from this value.
- height, of type unsigned long
- 
      The encoded height of output EncodedVideoChunks in pixels, prior to any display aspect ratio adjustments. The encoder must scale any VideoFrame whose [[crop height]] differs from this value.
- displayWidth, of type unsigned long
- The intended display width of output EncodedVideoChunks in pixels. Defaults to width if not present.
- displayHeight, of type unsigned long
- The intended display height of output EncodedVideoChunks in pixels. Defaults to height if not present.
A displayWidth or displayHeight that differs from width and height signals
      that chunks should be scaled after decoding to arrive at the final
      display aspect ratio. 
    For many codecs this is merely pass-through information, but some codecs may optionally include display sizing in the bitstream.
- hardwareAcceleration, of type HardwareAcceleration, defaulting to "allow"
-  Configures hardware acceleration for this codec. See HardwareAcceleration.
- scalabilityMode, of type DOMString
- An encoding scalability mode identifier as defined by [WebRTC-SVC].
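This example is non-normative. The display size defaults can be sketched as below, assuming that displayWidth falls back to width and displayHeight falls back to height when they are not present. The interface `VideoEncoderConfigLike` and both function names are hypothetical.

```typescript
// Hypothetical subset of VideoEncoderConfig used for illustration.
interface VideoEncoderConfigLike {
  codec: string;
  width: number;
  height: number;
  displayWidth?: number;
  displayHeight?: number;
}

// Applies the assumed defaults for the display dimensions.
function resolveDisplaySize(
  config: VideoEncoderConfigLike
): { displayWidth: number; displayHeight: number } {
  return {
    displayWidth: config.displayWidth ?? config.width,
    displayHeight: config.displayHeight ?? config.height,
  };
}

// True when the display size differs from the encoded size, which signals
// that decoded output should be scaled to reach the display aspect ratio.
function signalsDisplayScaling(config: VideoEncoderConfigLike): boolean {
  const display = resolveDisplaySize(config);
  return display.displayWidth !== config.width || display.displayHeight !== config.height;
}
```

For instance, a 720x480 encode with a displayWidth of 853 describes content stored with non-square pixels that is intended to be shown at roughly 16:9.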
7.9. Hardware Acceleration
enum HardwareAcceleration {
  "allow",
  "deny",
  "require",
};
When supported, hardware acceleration offloads encoding or decoding to specialized hardware.
The default value is allow. This gives the user agent flexibility to
  optimize based on its knowledge of the system and configuration. A common
  strategy will be to prioritize hardware acceleration at higher resolutions
  with a fallback to software codecs if hardware acceleration fails. 
    Authors should carefully weigh the tradeoffs when setting a hardware acceleration preference. The precise tradeoffs will be device-specific, but authors should generally expect the following:
- 
      Setting a value of require may significantly restrict what configurations are supported. It may occur that the user’s device does not offer acceleration for any codec, or only for the most common profiles of older codecs.
- 
      Hardware acceleration does not simply imply faster encoding / decoding. Hardware acceleration often has higher startup latency but more consistent throughput performance. Acceleration will generally reduce CPU load. 
- 
      For decoding, hardware acceleration is often less robust to inputs that are mislabeled or violate the relevant codec specification. 
- 
      Hardware acceleration will often be more power efficient than purely software based codecs. 
- 
      For lower resolution content, the overhead added by hardware acceleration may yield decreased performance and power efficiency compared to purely software based codecs. 
Given these tradeoffs, a good example of using "require" would be if an author intends to provide their own software based fallback via WebAssembly.
Alternatively, a good example of using "deny" would be if an author is especially sensitive to the higher startup latency or decreased robustness generally associated with hardware acceleration.
- allow
- Indicates that the user agent may use hardware acceleration if it is available and compatible with other aspects of the codec configuration.
- deny
- 
      Indicates that the user agent must not use hardware acceleration. 
     NOTE: This will cause the configuration to be unsupported on platforms where an unaccelerated codec is unavailable or is incompatible with other aspects of the codec configuration. 
- require
- 
      Indicates that the user agent must use hardware acceleration. 
     NOTE: This will cause the configuration to be unsupported on platforms where an accelerated codec is unavailable or is incompatible with other aspects of the codec configuration. 
7.10. Configuration Equivalence
Two dictionaries are equal dictionaries if they contain the same keys and values. For nested dictionaries, apply this definition recursively.
7.11. VideoEncoderEncodeOptions
dictionary VideoEncoderEncodeOptions {
  boolean keyFrame = false;
};
- keyFrame, of type boolean, defaulting to false
- A value of true indicates that the given frame MUST be encoded as a key frame. A value of false indicates that the user agent has flexibility to decide whether the frame will be encoded as a key frame.
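This example is non-normative. One way an author might use keyFrame is to force a key frame at a fixed interval and leave the decision to the user agent for all other frames. The helper `encodeOptionsForFrame` and its parameters are hypothetical; the result would be passed as the options argument of encode().

```typescript
// Hypothetical mirror of VideoEncoderEncodeOptions.
interface EncodeOptionsLike {
  keyFrame: boolean;
}

// Requests a key frame for frame 0 and then once every keyFrameInterval
// frames. For the remaining frames keyFrame is false, which leaves the
// choice to the user agent.
function encodeOptionsForFrame(
  frameIndex: number,
  keyFrameInterval: number
): EncodeOptionsLike {
  if (!Number.isInteger(keyFrameInterval) || keyFrameInterval <= 0) {
    throw new RangeError("keyFrameInterval must be a positive integer");
  }
  return { keyFrame: frameIndex % keyFrameInterval === 0 };
}
```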
7.12. CodecState
enum CodecState {
  "unconfigured",
  "configured",
  "closed"
};
- unconfigured
- The codec is not configured for encoding or decoding.
- configured
- A valid configuration has been provided. The codec is ready for encoding or decoding.
- closed
- The codec is no longer usable and underlying system resources have been released.
7.13. WebCodecsErrorCallback
callback WebCodecsErrorCallback = undefined(DOMException error);
8. Encoded Media Interfaces (Chunks)
These interfaces represent chunks of encoded media.
8.1. EncodedAudioChunk Interface
[Exposed=(Window,DedicatedWorker)]
interface EncodedAudioChunk {
  constructor(EncodedAudioChunkInit init);
  readonly attribute EncodedAudioChunkType type;
  readonly attribute long long timestamp;     // microseconds
  readonly attribute unsigned long duration;  // microseconds
  readonly attribute unsigned long byteLength;

  undefined copyTo([AllowShared] BufferSource destination);
};

dictionary EncodedAudioChunkInit {
  required EncodedAudioChunkType type;
  [EnforceRange] required long long timestamp;  // microseconds
  required BufferSource data;
};

enum EncodedAudioChunkType {
  "key",
  "delta",
};
8.1.1. Internal Slots
- [[internal data]]
- 
     An array of bytes representing the encoded chunk data. 
- [[type]]
- 
     Describes whether the chunk is a key chunk. 
- [[timestamp]]
- 
     The presentation timestamp, given in microseconds. 
- [[duration]]
- 
     The presentation duration, given in microseconds. 
- [[byte length]]
- 
     The byte length of [[internal data]].
8.1.2. Constructors
 EncodedAudioChunk(init)  
   - 
     Let chunk be a new EncodedAudioChunk object, initialized as follows:
- 
       Assign init.type to [[type]].
- 
       Assign init.timestamp to [[timestamp]].
- 
       Assign a copy of init.data to [[internal data]].
- 
       Assign init.data.byteLength to [[byte length]].
- 
     Return chunk. 
8.1.3. Attributes
- type, of type EncodedAudioChunkType, readonly
- 
     Returns the value of [[type]].
- timestamp, of type long long, readonly
- 
     Returns the value of [[timestamp]].
- duration, of type unsigned long, readonly
- 
     Returns the value of [[duration]].
- byteLength, of type unsigned long, readonly
- 
     Returns the value of [[byte length]].
8.1.4. Methods
- copyTo(destination)
- 
     When invoked, run these steps:
- 
       If the [[byte length]] of this EncodedAudioChunk is greater than the byte length of destination, throw a TypeError.
- 
       Copy the [[internal data]] into destination.
8.2. EncodedVideoChunk Interface
[Exposed=(Window,DedicatedWorker)]
interface EncodedVideoChunk {
  constructor(EncodedVideoChunkInit init);
  readonly attribute EncodedVideoChunkType type;
  readonly attribute long long timestamp;            // microseconds
  readonly attribute unsigned long long? duration;   // microseconds
  readonly attribute unsigned long byteLength;

  undefined copyTo([AllowShared] BufferSource destination);
};

dictionary EncodedVideoChunkInit {
  required EncodedVideoChunkType type;
  [EnforceRange] required long long timestamp;       // microseconds
  [EnforceRange] unsigned long long duration;        // microseconds
  required BufferSource data;
};

enum EncodedVideoChunkType {
  "key",
  "delta",
};
8.2.1. Internal Slots
- [[internal data]]
- 
     An array of bytes representing the encoded chunk data. 
- [[type]]
- 
     The EncodedVideoChunkType of this EncodedVideoChunk.
- [[timestamp]]
- 
     The presentation timestamp, given in microseconds. 
- [[duration]]
- 
     The presentation duration, given in microseconds. 
- [[byte length]]
- 
     The byte length of [[internal data]].
8.2.2. Constructors
 EncodedVideoChunk(init)  
   - 
     Let chunk be a new EncodedVideoChunk object, initialized as follows:
- 
       Assign init.type to [[type]].
- 
       Assign init.timestamp to [[timestamp]].
- 
       If duration is present in init, assign init.duration to [[duration]]. Otherwise, assign null to [[duration]].
- 
       Assign a copy of init.data to [[internal data]].
- 
       Assign init.data.byteLength to [[byte length]].
- 
     Return chunk. 
8.2.3. Attributes
- type, of type EncodedVideoChunkType, readonly
- 
     Returns the value of [[type]].
- timestamp, of type long long, readonly
- 
     Returns the value of [[timestamp]].
- duration, of type unsigned long long, readonly, nullable
- 
     Returns the value of [[duration]].
- byteLength, of type unsigned long, readonly
- 
     Returns the value of [[byte length]].
8.2.4. Methods
- copyTo(destination)
- 
     When invoked, run these steps:
- 
       If [[byte length]] is greater than the byte length of destination, throw a TypeError.
- 
       Copy the [[internal data]] into destination.
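This example is non-normative. The copyTo() steps for encoded chunks amount to a bounds check followed by a byte copy. The function `copyChunkData` is a hypothetical stand-in that takes the chunk bytes as a Uint8Array instead of reading the [[internal data]] slot.

```typescript
// Copies the encoded bytes into destination, rejecting destinations that
// are too small to hold all of the chunk data.
function copyChunkData(internalData: Uint8Array, destination: Uint8Array): void {
  if (internalData.byteLength > destination.byteLength) {
    throw new TypeError("destination is smaller than the chunk data");
  }
  destination.set(internalData);
}
```

An author would typically size the destination from the byteLength attribute of the chunk before calling copyTo().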
 
9. Raw Media Interfaces
These interfaces represent unencoded (raw) media.
9.1. Memory Model
9.1.1. Background
This section is non-normative.
Decoded media data may occupy a large amount of system memory. To minimize the
need for expensive copies, this specification defines a scheme for reference
counting (clone() and close()).
NOTE: Authors should take care to invoke close() immediately when frames are
    no longer needed.
9.1.2. Reference Counting
A media resource is storage for the actual pixel data or the audio
sample data described by a VideoFrame or AudioData.
The AudioData [[resource reference]] and VideoFrame [[resource reference]] internal slots hold a reference to a media resource.
VideoFrame.clone() and AudioData.clone() return new objects whose [[resource reference]] points to the same media resource as the original
object.
VideoFrame.close() and AudioData.close() will clear their [[resource reference]] slot, releasing the reference to their media resource.
A media resource must remain alive at least as long as it continues to be
referenced by a [[resource reference]].
NOTE: When a media resource is no longer referenced by a [[resource reference]], the resource may be destroyed. User agents are
    encouraged to destroy such resources quickly to reduce memory pressure and
    facilitate resource reuse.
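This example is non-normative. The reference counting scheme can be sketched with two small classes. `MediaResourceModel` and `FrameHandleModel` are hypothetical names, and destroying the resource as soon as the last reference is released is an assumption of this sketch; the text above only says that the resource may be destroyed at that point.

```typescript
// Hypothetical model of a media resource shared by several frame objects.
class MediaResourceModel {
  private refCount = 0;
  private destroyed = false;

  addReference(): void {
    this.refCount += 1;
  }

  releaseReference(): void {
    this.refCount -= 1;
    // Assumption: destroy eagerly once nothing references the resource.
    if (this.refCount === 0) this.destroyed = true;
  }

  isDestroyed(): boolean {
    return this.destroyed;
  }
}

// Hypothetical model of a VideoFrame or AudioData holding a
// [[resource reference]].
class FrameHandleModel {
  private resourceReference: MediaResourceModel | null;

  constructor(resource: MediaResourceModel) {
    resource.addReference();
    this.resourceReference = resource;
  }

  // clone() yields a new object that references the same media resource.
  clone(): FrameHandleModel {
    if (this.resourceReference === null) throw new Error("InvalidStateError");
    return new FrameHandleModel(this.resourceReference);
  }

  // close() clears the reference; calling it again has no further effect.
  close(): void {
    if (this.resourceReference === null) return;
    this.resourceReference.releaseReference();
    this.resourceReference = null;
  }

  isDetached(): boolean {
    return this.resourceReference === null;
  }
}
```

In this model no pixel or sample data is copied by clone(), and the resource stays alive until every clone has been closed.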
9.2. AudioData Interface
[Exposed=(Window,DedicatedWorker)]
interface AudioData {
  constructor(AudioDataInit init);

  readonly attribute AudioSampleFormat format;
  readonly attribute float sampleRate;
  readonly attribute unsigned long numberOfFrames;
  readonly attribute unsigned long numberOfChannels;
  readonly attribute unsigned long long duration;  // microseconds
  readonly attribute long long timestamp;          // microseconds

  unsigned long allocationSize(AudioDataCopyToOptions options);
  undefined copyTo([AllowShared] BufferSource destination, AudioDataCopyToOptions options);
  AudioData clone();
  undefined close();
};

dictionary AudioDataInit {
  required AudioSampleFormat format;
  [EnforceRange] required float sampleRate;
  [EnforceRange] required unsigned long numberOfFrames;
  [EnforceRange] required unsigned long numberOfChannels;
  [EnforceRange] required long long timestamp;  // microseconds
  required BufferSource data;
};
9.2.1. Internal Slots
- [[detached]]
- 
     Boolean indicating whether close() was invoked on this AudioData.
- [[resource reference]]
- 
     A reference to a media resource that stores the audio sample data for this AudioData.
- [[format]]
- 
     The AudioSampleFormat used by this AudioData.
- [[sample rate]]
- 
     The sample-rate, in Hz, for this AudioData.
- [[number of frames]]
- 
     The number of frames for this AudioData.
- [[number of channels]]
- 
     The number of audio channels for this AudioData.
- [[timestamp]]
- 
     The presentation timestamp, in microseconds, for this AudioData.
9.2.2. Constructors
 AudioData(init)  
   - 
     Let frame be a new AudioData object, initialized as follows:
- 
       Assign false to [[detached]].
- 
       Assign init.format to [[format]].
- 
       Assign init.sampleRate to [[sample rate]].
- 
       Assign init.numberOfFrames to [[number of frames]].
- 
       Assign init.numberOfChannels to [[number of channels]].
- 
       Assign init.timestamp to [[timestamp]].
- 
       Let resource be a media resource containing a copy of init.data.
- 
       Let resourceReference be a reference to resource. 
- 
       Assign resourceReference to [[resource reference]].
- 
     Return frame. 
9.2.3. Attributes
- format, of type AudioSampleFormat, readonly
- 
     The AudioSampleFormat used by this AudioData. The format getter steps are to return [[format]].
- sampleRate, of type float, readonly
- 
     The sample-rate, in Hz, for this AudioData. The sampleRate getter steps are to return [[sample rate]].
- numberOfFrames, of type unsigned long, readonly
- 
     The number of frames for this AudioData. The numberOfFrames getter steps are to return [[number of frames]].
- numberOfChannels, of type unsigned long, readonly
- 
     The number of audio channels for this AudioData. The numberOfChannels getter steps are to return [[number of channels]].
- timestamp, of type long long, readonly
- 
     The presentation timestamp, in microseconds, for this AudioData. The timestamp getter steps are to return [[timestamp]].
- duration, of type unsigned long long, readonly
- 
     The duration, in microseconds, for this AudioData. The duration getter steps are to:
- 
       Let microsecondsPerSecond be 1,000,000.
- 
       Let durationInSeconds be the result of dividing [[number of frames]] by [[sample rate]].
- 
       Return the product of durationInSeconds and microsecondsPerSecond. 
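This example is non-normative. The duration getter steps reduce to a single expression. The function name `audioDataDurationMicroseconds` is hypothetical, and the sketch returns the unrounded product; how the value is converted to an unsigned long long is not addressed by the steps above.

```typescript
// Computes the duration in microseconds from the number of frames and the
// sample rate in Hz, following the getter steps literally.
function audioDataDurationMicroseconds(
  numberOfFrames: number,
  sampleRate: number
): number {
  const microsecondsPerSecond = 1000000;
  const durationInSeconds = numberOfFrames / sampleRate;
  return durationInSeconds * microsecondsPerSecond;
}
```

For example, 48000 frames at a sample rate of 48000 Hz last one second, which is 1,000,000 microseconds.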
 
9.2.4. Methods
- allocationSize(options)
- 
     Returns the number of bytes required to hold the samples as described by options.
     When invoked, run these steps:
- 
       Let copyElementCount be the result of running the Compute Copy Element Count algorithm with options. 
- 
       Let bytesPerSample be the number of bytes per sample, as defined by the [[format]].
- 
       Return the product of multiplying bytesPerSample by copyElementCount. 
- copyTo(destination, options)
- 
     Copies the samples from the specified plane of the AudioData to the destination buffer.
     When invoked, run these steps:
- 
       If the value of the [[detached]] internal slot is true, throw an InvalidStateError DOMException.
- 
       Let copyElementCount be the result of running the Compute Copy Element Count algorithm with options. 
- 
       Let bytesPerSample be the number of bytes per sample, as defined by the [[format]].
- 
       If the product of multiplying bytesPerSample by copyElementCount is greater than destination.byteLength, throw a RangeError.
- 
       Let resource be the media resource referenced by [[resource reference]].
- 
       Let planeFrames be the region of resource corresponding to options.planeIndex.
- 
       Copy elements of planeFrames into destination, starting with the frame positioned at options.frameOffset and stopping after copyElementCount samples have been copied.
- clone()
- 
     Creates a new AudioData with a reference to the same media resource.
     When invoked, run these steps:
- 
       If the value of the [[detached]] internal slot is true, throw an InvalidStateError DOMException.
- 
       Return the result of running the Clone AudioData algorithm with this. 
- close()
- 
     Clears all state and releases the reference to the media resource. Close is final.
     When invoked, run these steps:
- 
       Assign true to the [[detached]] internal slot.
- 
       Assign null to [[resource reference]].
9.2.5. Algorithms
- Compute Copy Element Count (with options)
- 
     Run these steps:
- 
       Let frameCount be the number of frames in the plane identified by options.planeIndex.
- 
       If options.frameOffset is greater than or equal to frameCount, throw a RangeError.
- 
       Let copyFrameCount be the difference of subtracting options.frameOffset from frameCount.
- 
       If options.frameCount exists:
- 
         If options.frameCount is greater than copyFrameCount, throw a RangeError.
- 
         Otherwise, assign options.frameCount to copyFrameCount.
- 
       Let elementCount be copyFrameCount. 
- 
       If [[format]] describes an interleaved AudioSampleFormat, multiply elementCount by [[number of channels]].
- 
       Return elementCount. 
 
- 
       
- Clone AudioData (with data)
- 
     Run these steps: - 
       Let clone be a new AudioData initialized as follows: - 
         Let resource be the media resource referenced by data’s [[resource reference]].
- 
         Let reference be a new reference to resource. 
- 
         Assign reference to [[resource reference]].
- 
         Assign the values of data’s [[detached]], [[format]], [[sample rate]], [[number of frames]], [[number of channels]], and [[timestamp]] slots to the corresponding slots in clone.
 
- 
         
- 
       Return clone. 
 
- 
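The interaction between clone(), close(), and the shared media resource can be sketched as follows. This is a minimal, non-normative model that assumes the user agent tracks references with a simple counter; `MediaResource`, `AudioDataModel`, and `refCount` are hypothetical names used only for illustration, and a plain `Error` named `InvalidStateError` stands in for the DOMException.

```typescript
// Hypothetical stand-in for a media resource shared by several AudioData objects.
class MediaResource {
  refCount = 0;
  samples: Float32Array;
  constructor(samples: Float32Array) {
    this.samples = samples;
  }
}

// Hypothetical model of the [[detached]] and [[resource reference]] slots.
class AudioDataModel {
  detached = false;
  resource: MediaResource | null;

  constructor(resource: MediaResource) {
    this.resource = resource;
    resource.refCount += 1; // taking a new reference to the resource
  }

  clone(): AudioDataModel {
    if (this.detached || this.resource === null) {
      const error = new Error("AudioData is closed");
      error.name = "InvalidStateError"; // stand-in for an InvalidStateError DOMException
      throw error;
    }
    // The clone references the same resource; no sample data is copied.
    return new AudioDataModel(this.resource);
  }

  close(): void {
    if (this.resource !== null) {
      this.resource.refCount -= 1; // release this object's reference
    }
    this.detached = true;
    this.resource = null;
  }
}
```

In this model the resource can be freed once `refCount` reaches zero, so closing the original does not invalidate its clones.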
       
9.2.6. AudioDataCopyToOptions
dictionary AudioDataCopyToOptions {
  required unsigned long planeIndex;
  unsigned long frameOffset = 0;
  unsigned long frameCount;
};
- planeIndex, of type unsigned long
- 
     The index identifying the plane to copy from. 
- frameOffset, of type unsigned long, defaulting to 0
- 
     An offset into the source plane data indicating which frame to begin copying from. Defaults to 0.
- frameCount, of type unsigned long
- 
     The number of frames to copy. If not provided, the copy will include all frames in the plane beginning with frameOffset.
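The Compute Copy Element Count steps, and the size check that copyTo() performs with the result, can be sketched as follows. This is a non-normative illustration: `AudioDataShape`, `isInterleaved`, `bytesPerSample`, and `requiredByteLength` are hypothetical helpers, and the sketch assumes every plane holds [[number of frames]] frames.

```typescript
type AudioSampleFormat =
  | "U8" | "S16" | "S24" | "S32" | "FLT"
  | "U8P" | "S16P" | "S24P" | "S32P" | "FLTP";

interface AudioDataCopyToOptions {
  planeIndex: number;
  frameOffset?: number; // defaults to 0
  frameCount?: number;
}

// Hypothetical view of the slots the algorithm reads.
interface AudioDataShape {
  format: AudioSampleFormat;
  numberOfFrames: number;
  numberOfChannels: number;
}

// Planar formats carry a trailing "P" in their name.
function isInterleaved(format: AudioSampleFormat): boolean {
  return !format.endsWith("P");
}

function bytesPerSample(format: AudioSampleFormat): number {
  switch (format) {
    case "U8":
    case "U8P":
      return 1;
    case "S16":
    case "S16P":
      return 2;
    default:
      return 4; // S24, S32, and FLT variants are stored in 32 bits
  }
}

function computeCopyElementCount(data: AudioDataShape, options: AudioDataCopyToOptions): number {
  const frameCount = data.numberOfFrames;
  const frameOffset = options.frameOffset ?? 0;
  if (frameOffset >= frameCount) {
    throw new RangeError("frameOffset is beyond the end of the plane");
  }
  let copyFrameCount = frameCount - frameOffset;
  if (options.frameCount !== undefined) {
    if (options.frameCount > copyFrameCount) {
      throw new RangeError("frameCount exceeds the frames available after frameOffset");
    }
    copyFrameCount = options.frameCount;
  }
  // Interleaved planes hold one element per channel for each frame.
  return isInterleaved(data.format) ? copyFrameCount * data.numberOfChannels : copyFrameCount;
}

// Minimum destination size, in bytes, that copyTo() would accept.
function requiredByteLength(data: AudioDataShape, options: AudioDataCopyToOptions): number {
  return bytesPerSample(data.format) * computeCopyElementCount(data, options);
}
```

An author can use a calculation like `requiredByteLength` to size the destination buffer before calling copyTo().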
9.3. Audio Sample Format
An audio sample format describes the numeric type used to represent a
single sample (e.g. 32-bit floating point) and the arrangement of samples from
different channels as either interleaved or planar. The audio
sample type refers solely to the numeric type and interval used to store
the data: U8, S16, S24, S32, or FLT for, respectively, unsigned 8-bit
integer, signed 16-bit integer, signed 24-bit integer (held in a 32-bit
container), signed 32-bit integer, and 32-bit floating point. The audio buffer
arrangement refers solely to the way the samples are laid out in memory
(planar or interleaved).
A sample refers to a single value that is the magnitude of a signal at a particular point in time in a particular channel.
A frame (or sample-frame) refers to a set of values of all channels of a multi-channel signal that happen at the exact same time.
Note: Consequently if an audio signal is mono (has only one channel), a frame and a sample refer to the same thing.
All audio samples in this specification use linear pulse-code modulation (Linear PCM): quantization levels are uniform between values.
Note: The Web Audio API, which is expected to be used with this specification, also uses Linear PCM.
enum AudioSampleFormat {
  "U8",
  "S16",
  "S24",
  "S32",
  "FLT",
  "U8P",
  "S16P",
  "S24P",
  "S32P",
  "FLTP",
};
- U8
- 
     8-bit unsigned integer samples with interleaved channel arrangement. 
- S16
- 
     16-bit signed integer samples with interleaved channel arrangement. 
- S24
- 
     32-bit signed integer samples with interleaved channel arrangement, holding the value in the 24 least significant bits. 
- S32
- 
     32-bit signed integer samples with interleaved channel arrangement. 
- FLT
- 
     32-bit float samples with interleaved channel arrangement. 
- U8P
- 
     8-bit unsigned integer samples with planar channel arrangement. 
- S16P
- 
     16-bit signed integer samples with planar channel arrangement. 
- S24P
- 
     32-bit signed integer samples with planar channel arrangement, holding the value in the 24 least significant bits. 
- S32P
- 
     32-bit signed integer samples with planar channel arrangement. 
- FLTP
- 
     32-bit float samples with planar channel arrangement. 
9.3.1. Arrangement of audio buffer
When an AudioData has an AudioSampleFormat that is interleaved, the audio samples from different channels are laid out
consecutively in the same buffer, in the order described in the section § 9.3.3 Audio channel ordering. The AudioData has a single plane, which therefore contains
[[number of frames]] * [[number of channels]] elements.
When an AudioData has an AudioSampleFormat that is planar, the audio samples from different channels are laid out
in different buffers, themselves arranged in an order described in the section § 9.3.3 Audio channel ordering. The AudioData has a number of planes equal to the AudioData's [[number of channels]]. Each plane contains [[number of frames]] elements.
Note: The Web Audio API currently uses FLTP exclusively.
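The two arrangements differ only in how a (frame, channel) pair maps to a buffer position. The following non-normative sketch converts an interleaved FLT buffer into FLTP-style planes; `interleavedToPlanar` is a hypothetical helper, not part of any API defined here.

```typescript
// Splits one interleaved buffer into one plane per channel.
// In the interleaved buffer, the sample for (frame f, channel c) sits at index
// f * numberOfChannels + c. In the planar layout it sits at planes[c][f].
function interleavedToPlanar(interleaved: Float32Array, numberOfChannels: number): Float32Array[] {
  const numberOfFrames = interleaved.length / numberOfChannels;
  if (!Number.isInteger(numberOfFrames)) {
    throw new RangeError("buffer length is not a whole number of frames");
  }
  const planes: Float32Array[] = [];
  for (let channel = 0; channel < numberOfChannels; channel++) {
    const plane = new Float32Array(numberOfFrames);
    for (let frame = 0; frame < numberOfFrames; frame++) {
      plane[frame] = interleaved[frame * numberOfChannels + channel];
    }
    planes.push(plane);
  }
  return planes;
}
```

For a stereo signal, the interleaved sequence L0 R0 L1 R1 becomes one plane holding L0 L1 and a second plane holding R0 R1.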
9.3.2. Magnitude of the audio samples
The minimum value and maximum value of an audio sample, for a particular audio sample type, are the values below which (respectively above which) audio clipping might occur. They are otherwise regular types, that can hold values outside this interval during intermediate processing.
The bias value for an audio sample type is the value that often corresponds to the middle of the range (but often the range is not symmetrical). An audio buffer comprised only of values equal to the bias value is silent.
| Sample type | IDL type | Minimum value | Bias value | Maximum value | 
|---|---|---|---|---|
| U8 | octet | 0 | 128 | +255 | 
| S16 | short | -32768 | 0 | +32767 | 
| S24 | long | -8388608 | 0 | +8388607 | 
| S32 | long | -2147483648 | 0 | +2147483647 | 
| FLT | float | -1.0 | 0.0 | +1.0 | 
Note: There is no data type that can hold 24 bits of information conveniently, but audio content using 24-bit samples is common, so 32-bit integers are commonly used to hold 24-bit content.
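The table above can be used to detect silence and to map integer samples onto the FLT range. The sketch below is non-normative. The silence check follows directly from the definition of the bias value; the scaling in `toFloat` is one common convention (divide the distance from the bias by the distance between the bias and the minimum) and is an assumption of this example, not a conversion mandated by this specification.

```typescript
type AudioSampleType = "U8" | "S16" | "S24" | "S32" | "FLT";

interface SampleRange {
  min: number;
  bias: number;
  max: number;
}

// Values copied from the table of minimum, bias, and maximum values.
const SAMPLE_RANGES: Record<AudioSampleType, SampleRange> = {
  U8: { min: 0, bias: 128, max: 255 },
  S16: { min: -32768, bias: 0, max: 32767 },
  S24: { min: -8388608, bias: 0, max: 8388607 },
  S32: { min: -2147483648, bias: 0, max: 2147483647 },
  FLT: { min: -1.0, bias: 0.0, max: 1.0 },
};

// A buffer is silent when every sample equals the bias value.
function isSilent(samples: ArrayLike<number>, type: AudioSampleType): boolean {
  const bias = SAMPLE_RANGES[type].bias;
  for (let i = 0; i < samples.length; i++) {
    if (samples[i] !== bias) return false;
  }
  return true;
}

// Assumed convention: the minimum value maps to -1.0 and the bias to 0.0.
function toFloat(value: number, type: AudioSampleType): number {
  if (type === "FLT") return value;
  const range = SAMPLE_RANGES[type];
  return (value - range.bias) / (range.bias - range.min);
}
```

With this convention the maximum integer value maps to slightly less than +1.0, because the integer ranges are not symmetrical around the bias.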
9.3.3. Audio channel ordering
When decoding, the ordering of the audio channels in the resulting AudioData MUST be the same as what is present in the EncodedAudioChunk.
When encoding, the ordering of the audio channels in the resulting EncodedAudioChunk MUST be the same as what is present in the given AudioData.
In other words, no channel reordering is performed when encoding and decoding.
Note: The container either implies or specifies the channel mapping: the channel attributed to a particular channel index.
9.4. VideoFrame Interface
NOTE: VideoFrame is a CanvasImageSource. A VideoFrame may be
    passed to any method accepting a CanvasImageSource, including CanvasDrawImage's drawImage().
[Exposed=(Window,DedicatedWorker)]
interface VideoFrame {
  constructor(CanvasImageSource image, optional VideoFrameInit init = {});
  constructor(sequence<(Plane or PlaneInit)> planes, VideoFramePlaneInit init);

  readonly attribute PixelFormat format;
  readonly attribute FrozenArray<Plane>? planes;
  readonly attribute unsigned long codedWidth;
  readonly attribute unsigned long codedHeight;
  readonly attribute unsigned long cropLeft;
  readonly attribute unsigned long cropTop;
  readonly attribute unsigned long cropWidth;
  readonly attribute unsigned long cropHeight;
  readonly attribute unsigned long displayWidth;
  readonly attribute unsigned long displayHeight;
  readonly attribute unsigned long long? duration;  // microseconds
  readonly attribute long long? timestamp;          // microseconds

  VideoFrame clone();
  undefined close();
};

dictionary VideoFrameInit {
  unsigned long long duration;  // microseconds
  long long timestamp;          // microseconds
};

dictionary VideoFramePlaneInit {
  required PixelFormat format;
  [EnforceRange] required unsigned long codedWidth;
  [EnforceRange] required unsigned long codedHeight;
  [EnforceRange] unsigned long cropLeft;
  [EnforceRange] unsigned long cropTop;
  [EnforceRange] unsigned long cropWidth;
  [EnforceRange] unsigned long cropHeight;
  [EnforceRange] unsigned long displayWidth;
  [EnforceRange] unsigned long displayHeight;
  [EnforceRange] unsigned long long duration;  // microseconds
  [EnforceRange] long long timestamp;          // microseconds
};
9.4.1. Internal Slots
- [[detached]]
- 
     A boolean indicating whether close() was invoked and underlying resources have been released.
- [[resource reference]]
- 
     A reference to the media resource that stores the pixel data for this frame. 
- [[format]]
- 
     A PixelFormat describing the pixel format of the VideoFrame.
- [[planes]]
- 
     A list of Planes describing the memory layout of the pixel data in VideoFrame. The number of Planes and their semantics are determined by [[format]].
- [[coded width]]
- 
     Width of the VideoFrame in pixels, prior to any cropping or aspect ratio adjustments.
- [[coded height]]
- 
     Height of the VideoFrame in pixels, prior to any cropping or aspect ratio adjustments.
- [[crop left]]
- 
     The number of pixels to remove from the left of the VideoFrame, prior to aspect ratio adjustments.
- [[crop top]]
- 
     The number of pixels to remove from the top of the VideoFrame, prior to aspect ratio adjustments.
- [[crop width]]
- 
     The width of pixels to include in the crop, starting from cropLeft. 
- [[crop height]]
- 
     The height of pixels to include in the crop, starting from cropTop. 
- [[display width]]
- 
     Width of the VideoFrame when displayed after applying aspect ratio adjustments.
- [[display height]]
- 
     Height of the VideoFrame when displayed after applying aspect ratio adjustments.
- [[duration]]
- 
     The presentation duration, given in microseconds. The duration is copied from the EncodedVideoChunk corresponding to this VideoFrame.
- [[timestamp]]
- 
     The presentation timestamp, given in microseconds. The timestamp is copied from the EncodedVideoChunk corresponding to this VideoFrame.
9.4.2. Constructors
 VideoFrame(image, init) 
- 
     Check the usability of the image argument. If this throws an exception or returns bad, then throw an InvalidStateError DOMException.
- 
     If the origin of image’s image data is not same origin with the entry settings object's origin, then throw a SecurityError DOMException.
- 
     Let frame be a new VideoFrame.
- 
     Switch on image: 
- 
       
       
- 
         If image’s media data has no natural dimensions (e.g., it’s a vector graphic with no specified content size), then throw an InvalidStateError DOMException.
- 
         Let resource be a new media resource containing a copy of image’s media data. If this is an animated image, image’s bitmap data must only be taken from the default image of the animation (the one that the format defines is to be used when animation is not supported or is disabled), or, if there is no such image, the first frame of the animation. 
- 
         Let width and height be the natural width and natural height of image. 
- 
         Run the Initialize Frame With Resource and Size algorithm with init, frame, resource, width, and height. 
 
- 
       
       - 
         If image’s networkState attribute is NETWORK_EMPTY, then throw an InvalidStateError DOMException.
- 
         Let currentPlaybackFrame be the VideoFrame at the current playback position.
- 
         Run the Initialize Frame From Other Frame algorithm with init, frame, and currentPlaybackFrame. 
 
- 
         
- 
       
       
- 
         Let resource be a new media resource containing a copy of image’s bitmap data. NOTE: Implementers should avoid a deep copy by using reference counting where feasible. 
- 
         Let width be image.width and height be image.height.
- 
         Run the Initialize Frame With Resource and Size algorithm with init, frame, resource, width, and height. 
 
- 
       
       - 
         Run the Initialize Frame From Other Frame algorithm with init, frame, and image. 
 
- 
         
 
- 
     Return frame. 
 VideoFrame(planes, init) 
- 
     If init is not a valid VideoFramePlaneInit, throw a TypeError.
- 
     If planes is incompatible with the given format (e.g. wrong number of planes), throw a TypeError. The spec should list additional format specific validation steps (e.g. number and order of planes, acceptable sizing, etc...). See #165. 
- 
     Let resource be a new media resource allocated in accordance with init. The spec should define explicit rules for each PixelFormat and reference them in the steps above. See #165. NOTE: The user agent may choose to allocate resource with a larger coded size and plane strides to improve memory alignment. Increases will be reflected by codedWidth, codedHeight, and stride.
- 
     Let resourceReference be a reference to resource. 
- 
     Let frame be a new VideoFrame object initialized as follows: - 
       Assign resourceReference to [[resource reference]].
- 
       Assign format to [[format]].
- 
       Assign a new list to [[planes]].
- 
       For each planeInit in planes: - 
         Copy planeInit.src to resource. NOTE: The user agent may use cropLeft and cropTop to copy only the crop region. It may also reposition the crop region within resource. The final position will be reflected by cropLeft and cropTop.
- 
         Let plane be a new Plane initialized as follows: - 
           Assign frame to [[parent frame]].
- 
           Let resourceStride be the stride of the plane corresponding to planeInit in resource. The spec should provide a definition (and possibly diagrams) for stride. See #166. 
- 
           Assign resourceStride to stride.
 
- 
           
- 
         Append plane to [[planes]].
 
- 
         
- 
       Let resourceCodedWidth be the coded width of resource. 
- 
       Let resourceCodedHeight be the coded height of resource. 
- 
       Let resourceCropLeft be the left offset of the crop origin of resource. 
- 
       Let resourceCropTop be the top offset of the crop origin of resource. The spec should provide definitions (and possibly diagrams) for coded size, crop size, and display size. See #166. 
- 
       Assign resourceCodedWidth, resourceCodedHeight, resourceCropLeft, and resourceCropTop to [[coded width]], [[coded height]], [[crop left]], and [[crop top]] respectively.
- 
       If init.cropWidth exists, assign it to [[crop width]]. Otherwise, assign [[coded width]] to [[crop width]].
- 
       If init.cropHeight exists, assign it to [[crop height]]. Otherwise, assign [[coded height]] to [[crop height]].
- 
       If init.displayWidth exists, assign it to [[display width]]. Otherwise, assign [[crop width]] to [[display width]].
- 
       If init.displayHeight exists, assign it to [[display height]]. Otherwise, assign [[crop height]] to [[display height]].
- 
       Assign init’s timestamp and duration to [[timestamp]] and [[duration]] respectively.
 
- 
       
- 
     Return frame. 
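The fallback chain used by the VideoFrame(planes, init) steps for crop and display sizes can be sketched as follows. This is a non-normative illustration: `FrameSizeInit`, `ResolvedFrameSizes`, and `resolveFrameSizes` are hypothetical names, and the sketch assumes the user agent allocated the resource with exactly the coded size given in init (the steps above allow a larger allocation).

```typescript
// Hypothetical subset of VideoFramePlaneInit covering only the size members.
interface FrameSizeInit {
  codedWidth: number;
  codedHeight: number;
  cropWidth?: number;
  cropHeight?: number;
  displayWidth?: number;
  displayHeight?: number;
}

interface ResolvedFrameSizes {
  codedWidth: number;
  codedHeight: number;
  cropWidth: number;
  cropHeight: number;
  displayWidth: number;
  displayHeight: number;
}

function resolveFrameSizes(init: FrameSizeInit): ResolvedFrameSizes {
  // Crop size falls back to the coded size.
  const cropWidth = init.cropWidth ?? init.codedWidth;
  const cropHeight = init.cropHeight ?? init.codedHeight;
  // Display size falls back to the (possibly defaulted) crop size.
  const displayWidth = init.displayWidth ?? cropWidth;
  const displayHeight = init.displayHeight ?? cropHeight;
  return {
    codedWidth: init.codedWidth,
    codedHeight: init.codedHeight,
    cropWidth,
    cropHeight,
    displayWidth,
    displayHeight,
  };
}
```

When only the coded size is supplied, all three sizes end up equal; each optional member overrides only its own slot and the slots that default from it.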
9.4.3. Attributes
- format, of type PixelFormat, readonly
- 
     Describes the arrangement of bytes in each plane as well as the number and order of the planes. The format getter steps are to return [[format]].
- planes, of type FrozenArray<Plane>, readonly, nullable
- 
     Holds pixel data, laid out as described by format and Plane attributes. The planes getter steps are to return [[planes]].
- codedWidth, of type unsigned long, readonly
- 
     Width of the VideoFrame in pixels, prior to any cropping or aspect ratio adjustments. The codedWidth getter steps are to return [[coded width]].
- codedHeight, of type unsigned long, readonly
- 
     Height of the VideoFrame in pixels, prior to any cropping or aspect ratio adjustments. The codedHeight getter steps are to return [[coded height]].
- cropLeft, of type unsigned long, readonly
- 
     The number of pixels to remove from the left of the VideoFrame, prior to aspect ratio adjustments. The cropLeft getter steps are to return [[crop left]].
- cropTop, of type unsigned long, readonly
- 
     The number of pixels to remove from the top of the VideoFrame, prior to aspect ratio adjustments. The cropTop getter steps are to return [[crop top]].
- cropWidth, of type unsigned long, readonly
- 
     The width of pixels to include in the crop, starting from cropLeft. The cropWidth getter steps are to return [[crop width]].
- cropHeight, of type unsigned long, readonly
- 
     The height of pixels to include in the crop, starting from cropTop. The cropHeight getter steps are to return [[crop height]].
- displayWidth, of type unsigned long, readonly
- 
     Width of the VideoFrame when displayed after applying aspect ratio adjustments. The displayWidth getter steps are to return [[display width]].
- displayHeight, of type unsigned long, readonly
- 
     Height of the VideoFrame when displayed after applying aspect ratio adjustments. The displayHeight getter steps are to return [[display height]].
- timestamp, of type long long, readonly, nullable
- 
     The presentation timestamp, given in microseconds. The timestamp is copied from the EncodedVideoChunk corresponding to this VideoFrame. The timestamp getter steps are to return [[timestamp]].
- duration, of type unsigned long long, readonly, nullable
- 
     The presentation duration, given in microseconds. The duration is copied from the EncodedVideoChunk corresponding to this VideoFrame. The duration getter steps are to return [[duration]].
9.4.4. Methods
- clone()
- 
     Creates a new VideoFrame with a reference to the same media resource. When invoked, run these steps: - 
       If the value of frame’s [[detached]] internal slot is true, throw an InvalidStateError DOMException.
- 
       Return the result of running the Clone VideoFrame algorithm with this. 
 
- 
       
- close()
- 
     Clears all state and releases the reference to the media resource. Close is final. When invoked, run these steps: - 
       Assign null to [[resource reference]].
- 
       Assign true to [[detached]].
- 
       Assign "" to format.
- 
       Assign null to planes.
- 
       Assign 0 to codedWidth, codedHeight, cropLeft, cropTop, cropWidth, cropHeight, displayWidth, and displayHeight.
 
- 
       
9.4.5. Algorithms
Create a VideoFrame (with output, timestamp, duration, displayAspectWidth, and displayAspectHeight) - 
     Let planes be a sequence of Planes containing the decoded video frame data from output.
- 
     Let pixelFormat be the PixelFormat of planes.
- 
     Let init be a VideoFramePlaneInit with the following keys: - 
       Assign timestamp to timestamp.
- 
       Assign duration to duration.
- 
       Let codedWidth and codedHeight be the width and height of the decoded video frame output in pixels, prior to any cropping or aspect ratio adjustments.
- 
       Let cropLeft, cropTop, cropWidth, and cropHeight be the crop region of the decoded video frame output in pixels, prior to any aspect ratio adjustments.
- 
       Let displayWidth and displayHeight be the display size of the decoded frame in pixels. 
- 
       If displayAspectWidth and displayAspectHeight are provided, increase displayWidth or displayHeight until the ratio of displayWidth to displayHeight matches the ratio of displayAspectWidth to displayAspectHeight. 
- 
       Assign the value of displayWidth and displayHeight to displayWidth and displayHeight respectively.
 
- 
       
- 
     Return a new VideoFrame, constructed with pixelFormat, planes, and init.
- To check if a VideoFramePlaneInit is a valid VideoFramePlaneInit, run these steps:
- 
     - 
       If codedWidth = 0 or codedHeight = 0, return false.
- 
       If cropWidth = 0 or cropHeight = 0, return false.
- 
       If cropTop + cropHeight > codedHeight, return false.
- 
       If cropLeft + cropWidth > codedWidth, return false.
- 
       If displayWidth = 0 or displayHeight = 0, return false.
- 
       Return true.
 
- 
       
- Initialize Frame From Other Frame (with init, frame, and otherFrame)
- 
     - 
       Let resource be the media resource referenced by otherFrame’s [[resource reference]].
- 
       Assign a new reference for resource to frame’s [[resource reference]].
- 
       Assign the following attributes from otherFrame to frame: format, codedWidth, codedHeight, cropLeft, cropTop, cropWidth, cropHeight, displayWidth, displayHeight.
- 
       Let planes be a new list. 
- 
       For each otherPlane in otherFrame.planes: - 
         Let plane be a new Plane.
- 
         Assign a reference for frame to plane’s [[parent frame]].
- 
         Assign the following attributes from otherPlane to plane: stride, rows, length.
- 
         Append plane to planes. 
 
- 
         
- 
       Assign planes to frame.planes.
- 
       If duration exists in init, assign it to frame.duration. Otherwise, assign otherFrame.duration to frame.duration.
- 
       If timestamp exists in init, assign it to frame.timestamp. Otherwise, assign otherFrame.timestamp to frame.timestamp.
 
- 
       
- Initialize Frame With Resource and Size (with init, frame, resource, width and height)
- 
     - 
       Assign a new reference for resource to frame’s [[resource reference]].
- 
       If resource uses a recognized PixelFormat: - 
         Assign the PixelFormat of resource to format.
- 
         Let planes be a list of Planes describing the media resource in accordance with the format. The spec should define explicit rules for each PixelFormat and reference them in the step above. See #165.
- 
         Assign planes to planes.
 
- 
         
- 
       Otherwise (resource does not use a recognized PixelFormat):
- 
       Assign width to the following attributes of frame: codedWidth, cropWidth, displayWidth.
- 
       Assign height to the following attributes of frame: codedHeight, cropHeight, displayHeight.
 
- 
       
- Clone VideoFrame (with frame)
- 
     - 
       Let clone be a new VideoFrame initialized as follows: - 
         Assign frame.[[resource reference]] to [[resource reference]].
- 
         For each plane in planes:
- 
         Assign all remaining attributes of frame (codedWidth, codedHeight, etc.) to those of the same name in clone.
 
- 
         
- 
       Return clone. 
 
- 
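The display size adjustment in the Create a VideoFrame steps ("increase displayWidth or displayHeight until the ratio ... matches") can be sketched as follows. This is a non-normative illustration. `applyDisplayAspect` is a hypothetical helper, and rounding to the nearest integer is an assumption of this example: the steps do not say how to proceed when no integer size matches the ratio exactly.

```typescript
interface DisplaySize {
  displayWidth: number;
  displayHeight: number;
}

// Grows one dimension (never shrinks either) so that
// displayWidth : displayHeight approximates aspectWidth : aspectHeight.
function applyDisplayAspect(
  displayWidth: number,
  displayHeight: number,
  aspectWidth: number,
  aspectHeight: number
): DisplaySize {
  // Compare the two ratios by cross-multiplication to avoid division.
  const current = displayWidth * aspectHeight;
  const target = displayHeight * aspectWidth;
  if (current < target) {
    // Too narrow for the requested aspect ratio: increase the width.
    return {
      displayWidth: Math.round((displayHeight * aspectWidth) / aspectHeight),
      displayHeight,
    };
  }
  if (current > target) {
    // Too wide for the requested aspect ratio: increase the height.
    return {
      displayWidth,
      displayHeight: Math.round((displayWidth * aspectHeight) / aspectWidth),
    };
  }
  return { displayWidth, displayHeight }; // ratios already match
}
```

For example, a 720x480 frame with a 16:9 display aspect ratio keeps its height and is widened, which is the usual treatment for anamorphic content.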
       
9.5. Plane Interface
A Plane is solely constructed by its VideoFrame. During construction,
    the User Agent may use knowledge of the frame’s PixelFormat to add
    padding to the Plane to improve memory alignment. 
   A Plane cannot be used after the VideoFrame is destroyed. A new VideoFrame can be assembled from existing Planes, and the new VideoFrame will remain valid when the original is destroyed. This makes
    it possible to efficiently add an alpha plane to an existing VideoFrame.
[Exposed=(Window,DedicatedWorker)]
interface Plane {
  readonly attribute unsigned long stride;
  readonly attribute unsigned long rows;
  readonly attribute unsigned long length;

  undefined readInto(ArrayBufferView dst);
};

dictionary PlaneInit {
  required BufferSource src;
  [EnforceRange] required unsigned long stride;
  [EnforceRange] required unsigned long rows;
};
9.5.1. Internal Slots
- [[parent frame]]
- Refers to the VideoFrame that constructed and owns this plane.
9.5.2. Attributes
- stride, of type unsigned long, readonly
- The width of each row including any padding.
- rows, of type unsigned long, readonly
- The number of rows.
- length, of type unsigned long, readonly
- The total byte length of the plane (stride * rows).
9.5.3. Methods
readInto(dst) 
   Copies the plane data into dst.
When invoked, run these steps:
- 
     If [[parent frame]] has been destroyed, throw an InvalidStateError.
- 
     If length is greater than dst.byteLength, throw a TypeError.
- 
     Let resource be the media resource referenced by [[parent frame]]'s [[resource reference]].
- 
     Let plane bytes be the region of bytes in resource corresponding to this plane. 
- 
     Copy the plane bytes into dst. 
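The readInto() steps can be sketched as follows. This is a non-normative model: `PlaneModel` is a hypothetical stand-in in which `bytes` is null once the parent frame has been destroyed, and a plain `Error` named `InvalidStateError` stands in for the DOMException.

```typescript
// Hypothetical model of a Plane and the bytes it refers to in the media resource.
interface PlaneModel {
  stride: number;
  rows: number;
  bytes: Uint8Array | null; // null models a destroyed parent frame
}

function readInto(plane: PlaneModel, dst: ArrayBufferView): void {
  if (plane.bytes === null) {
    const error = new Error("parent frame has been destroyed");
    error.name = "InvalidStateError";
    throw error;
  }
  const length = plane.stride * plane.rows; // total byte length of the plane
  if (length > dst.byteLength) {
    throw new TypeError("dst is too small to hold the plane");
  }
  // View dst as bytes regardless of its element type, then copy.
  const target = new Uint8Array(dst.buffer, dst.byteOffset, dst.byteLength);
  target.set(plane.bytes.subarray(0, length));
}
```

Because stride includes any row padding, a destination sized from the visible width alone can be too small; sizing it from length avoids the TypeError.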
9.6. Pixel Format
Pixel formats describe the arrangement of bytes in each plane as well as the number and order of the planes.
NOTE: This section needs work. We expect to add more pixel formats and offer much more verbose definitions. For now, please see http://www.fourcc.org/pixel-format/yuv-i420/ for a more complete description.
enum PixelFormat {
  "I420"
};
- I420
- Planar 4:2:0 YUV.
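As a non-normative illustration of how stride, rows, and length relate for I420, the sketch below computes a plane layout for a given coded size. It assumes 8-bit samples and no row padding, whereas a user agent may add padding for alignment; `i420PlaneLayout` and `PlaneLayout` are hypothetical names.

```typescript
interface PlaneLayout {
  stride: number; // bytes per row
  rows: number;
  length: number; // stride * rows
}

// Returns the Y, U, and V plane layouts, in that order.
function i420PlaneLayout(codedWidth: number, codedHeight: number): PlaneLayout[] {
  // 4:2:0 subsampling halves the chroma resolution in both dimensions,
  // rounding up for odd sizes.
  const chromaWidth = Math.ceil(codedWidth / 2);
  const chromaHeight = Math.ceil(codedHeight / 2);
  const layout = (stride: number, rows: number): PlaneLayout => ({
    stride,
    rows,
    length: stride * rows,
  });
  return [
    layout(codedWidth, codedHeight), // Y (luma)
    layout(chromaWidth, chromaHeight), // U (chroma)
    layout(chromaWidth, chromaHeight), // V (chroma)
  ];
}
```

Under these assumptions a 640x480 frame occupies 460800 bytes across its three planes, or 1.5 bytes per pixel.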
10. Image Decoding
10.1. Background
This section is non-normative.
Image codec definitions are typically accompanied by a definition for a
corresponding file format. Hence image decoders often perform both duties of
unpacking (demuxing) as well as decoding the encoded image data. The WebCodecs ImageDecoder follows this pattern, which motivates an interface design that
is notably different from that of VideoDecoder and AudioDecoder.
In spite of these differences, ImageDecoder uses the same codec processing model as the other codec interfaces. Additionally, ImageDecoder uses the VideoFrame interface to describe decoded outputs.
10.2. ImageDecoder Interface
[Exposed=(Window,DedicatedWorker)]
interface ImageDecoder {
  constructor(ImageDecoderInit init);

  readonly attribute boolean complete;
  readonly attribute Promise<undefined> completed;
  readonly attribute ImageTrackList tracks;

  Promise<ImageDecodeResult> decode(optional ImageDecodeOptions options = {});
  undefined reset();
  undefined close();

  static Promise<boolean> isTypeSupported(DOMString type);
};
10.2.1. Internal Slots
- [[ImageTrackList]]
- 
     An ImageTrackList describing the tracks found in [[encoded data]].
- [[complete]]
- 
     A boolean indicating whether [[encoded data]] is completely buffered.
- [[completed promise]]
- 
     The promise used to signal when [[complete]] becomes true.
- [[codec implementation]]
- 
     An underlying image decoder implementation provided by the User Agent. 
- [[encoded data]]
- 
     A byte sequence containing the encoded image data to be decoded. 
- [[prefer animation]]
- 
     A boolean reflecting the value of preferAnimation given at construction.
- [[pending decode promises]]
- 
     A list of unresolved promises returned by calls to decode(). 
- [[internal selected track index]]
- 
     Identifies the image track within [[encoded data]] that is used by decoding algorithms on the codec thread.
- [[tracks established]]
- 
     A boolean indicating whether the track list has been established in [[ImageTrackList]].
- [[closed]]
- 
     A boolean indicating that the ImageDecoder is in a permanent closed state and can no longer be used. 
- [[progressive frame generations]]
- 
     A mapping of frame indices to Progressive Image Frame Generations. The values represent the Progressive Image Frame Generation for the VideoFrame which was most recently output by a call to decode() with the given frame index.
10.2.2. Constructor
- ImageDecoder(init)
- 
     NOTE: Calling decode() on the constructed ImageDecoder will trigger a NotSupportedError if the user agent does not support type. Authors should first check support by calling isTypeSupported() with type. User agents are not required to support any particular type. When invoked, run these steps: - 
       If init is not a valid ImageDecoderInit, throw a TypeError.
- 
       Let d be a new ImageDecoder object. In the steps below, all mentions of ImageDecoder members apply to d unless stated otherwise.
- 
       Assign [[ImageTrackList]] a new ImageTrackList initialized as follows: - 
         Assign a new list to [[track list]].
- 
         Assign -1 to [[selected index]].
 
- 
         
- 
       Assign null to [[codec implementation]].
- 
       If init.preferAnimation exists, assign init.preferAnimation to the [[prefer animation]] internal slot. Otherwise, assign null to the [[prefer animation]] internal slot.
- 
       Assign a new list to [[pending decode promises]].
- 
       Assign -1 to [[internal selected track index]].
- 
       Assign false to [[tracks established]].
- 
       Assign false to [[closed]].
- 
       Assign a new map to [[progressive frame generations]].
- 
       If init’s data member is of type ReadableStream: - 
         Assign a new list to [[encoded data]].
- 
         Assign false to [[complete]].
- 
         Queue a control message to configure the image decoder with init. 
- 
         Let reader be the result of getting a reader for data.
- 
         In parallel, perform the Fetch Stream Data Loop on d with reader. 
 
- 
         
- 
       Otherwise: - 
         Assert that init.data is of type BufferSource.
- 
         Assign a copy of init.data to [[encoded data]].
- 
         Assign true to [[complete]].
- 
         Resolve [[completed promise]].
- 
         Queue a control message to configure the image decoder with init. 
- 
         Queue a control message to decode track metadata. 
 
- 
         
- 
       Return d. 
 Running a control message to configure the image decoder means running these steps: - 
       Let supported be the result of running the Check Type Support algorithm with init.type.
- 
       If supported is false, queue a task on the control thread event loop to run the Close ImageDecoder algorithm with a NotSupportedError DOMException and abort these steps.
- 
       If supported is true, assign the [[codec implementation]] internal slot with an implementation supporting init.type.
- 
       Configure [[codec implementation]] in accordance with the values given for premultiplyAlpha, colorSpaceConversion, desiredWidth, and desiredHeight.
 Running a control message to decode track metadata means running these steps: - 
       Run the Establish Tracks algorithm. 
 
- 
       
10.2.3. Attributes
- complete, of type boolean, readonly
- 
     Indicates whether [[encoded data]] is completely buffered. The complete getter steps are to return [[complete]].
- completed, of type Promise<undefined>, readonly
- 
     The promise used to signal when complete becomes true. The completed getter steps are to return [[completed promise]].
- tracks, of type ImageTrackList, readonly
- 
     Returns a live ImageTrackList, which provides metadata for the available tracks and a mechanism for selecting a track to decode. The tracks getter steps are to return [[ImageTrackList]].
10.2.4. Methods
- decode(options)
- 
     Enqueues a control message to decode the frame according to options. When invoked, run these steps: - 
       If [[closed]] is true, return a Promise rejected with an InvalidStateError DOMException.
- 
       If [[ImageTrackList]]'s [[selected index]] is -1, return a Promise rejected with an InvalidStateError DOMException.
- 
       If options is undefined, assign a new ImageDecodeOptions to options.
- 
       Let promise be a new Promise.
- 
       Queue a control message to decode the image with options and promise. 
- 
       Append promise to [[pending decode promises]].
- 
       Return promise. 
 Running a control message to decode the image means running these steps: - 
       Wait for [[tracks established]] to become true.
- 
       If options.completeFramesOnly is false and the image is a Progressive Image for which the user agent supports progressive decoding, run the Decode Progressive Frame algorithm with options.frameIndex and promise.
- 
       Otherwise, run the Decode Complete Frame algorithm with options.frameIndex and promise.
 
- 
       
- reset()
- 
     Immediately aborts all pending work. When invoked, run the Reset ImageDecoder algorithm with an AbortError DOMException.
- close()
- 
     Immediately aborts all pending work and releases system resources. Close is final. When invoked, run the Close ImageDecoder algorithm with an AbortError DOMException.
- isTypeSupported(type)
- 
     Returns a promise indicating whether the provided type is supported by the user agent. When invoked, run these steps: - 
       If type is not a valid image MIME type, return a Promise rejected with a TypeError.
- 
       Let p be a new Promise.
- 
       In parallel, resolve p with the result of running the Check Type Support algorithm with type. 
- 
       Return p. 
 
- 
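The decision logic of decode() and of its control message can be sketched as a pure function. This is a non-normative model: `DecoderStateModel`, `DecodePlan`, and `planDecode` are hypothetical names, waiting for [[tracks established]] is omitted, and the defaults used when options members are absent (frameIndex 0, completeFramesOnly true) are assumptions of this example, since ImageDecodeOptions is defined elsewhere in the specification.

```typescript
interface ImageDecodeOptionsModel {
  frameIndex?: number;
  completeFramesOnly?: boolean;
}

// Hypothetical snapshot of the state that decode() consults.
interface DecoderStateModel {
  closed: boolean; // [[closed]]
  selectedIndex: number; // [[ImageTrackList]]'s [[selected index]]
  progressiveImage: boolean; // the image is a Progressive Image
  progressiveDecodingSupported: boolean; // the user agent can decode it progressively
}

type DecodePlan =
  | { kind: "reject"; error: "InvalidStateError" }
  | { kind: "progressive" | "complete"; frameIndex: number };

function planDecode(state: DecoderStateModel, options: ImageDecodeOptionsModel = {}): DecodePlan {
  // Synchronous checks made by decode() before queuing a control message.
  if (state.closed || state.selectedIndex === -1) {
    return { kind: "reject", error: "InvalidStateError" };
  }
  const frameIndex = options.frameIndex ?? 0; // assumed default
  const completeFramesOnly = options.completeFramesOnly ?? true; // assumed default
  // Choice made when the control message runs.
  if (!completeFramesOnly && state.progressiveImage && state.progressiveDecodingSupported) {
    return { kind: "progressive", frameIndex };
  }
  return { kind: "complete", frameIndex };
}
```

Progressive decoding is chosen only when the author opts out of complete frames and both the image and the user agent support it; every other combination falls back to the Decode Complete Frame algorithm.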
       
10.2.5. Algorithms
- Fetch Stream Data Loop (with reader)
- 
     Run these steps: - 
       Let readRequest be the following read request. - chunk steps, given chunk
- 
         - 
           If [[closed]] is true, abort these steps.
- 
           If chunk is not a Uint8Array object, queue a task on the control thread event loop to run the Close ImageDecoder algorithm with a DataError DOMException and abort these steps.
- 
           Let bytes be the byte sequence represented by the Uint8Array object. 
- 
           Append bytes to the [[encoded data]] internal slot.
- 
           If [[tracks established]] is false, run the Establish Tracks algorithm.
- 
           Otherwise, run the Update Tracks algorithm. 
- 
           Run the Fetch Stream Data Loop algorithm with reader. 
 
- 
           
- close steps
- 
         - 
           Assign true to [[complete]].
- 
           Resolve [[completed promise]].
 
- 
           
- error steps
- 
         - 
           Queue a task on the control thread event loop to run the Close ImageDecoder algorithm with a NotReadableError DOMException.
 
- 
           
 
- 
       Read a chunk from reader given readRequest. 
 
- 
       
- Establish Tracks
- 
     Run these steps: - 
       Assert [[tracks established]] is false.
- 
       If [[encoded data]] does not contain enough data to determine the number of tracks: - 
         If complete is true, queue a task on the control thread event loop to run the Close ImageDecoder algorithm.
- 
         Abort these steps. 
 
- 
         
- 
       If the number of tracks is found to be 0, queue a task on the control thread event loop to run the Close ImageDecoder algorithm and abort these steps.
- 
       Let newTrackList be a new list. 
- 
       For each image track found in [[encoded data]]: - 
         Let newTrack be a new ImageTrack, initialized as follows: - 
           Assign this to [[ImageDecoder]].
- 
           Assign tracks to [[ImageTrackList]].
- 
           If image track is found to be animated, assign true to newTrack’s [[animated]] internal slot. Otherwise, assign false.
- 
           If image track is found to describe a frame count, assign that count to newTrack’s [[frame count]] internal slot. Otherwise, assign 0.
           NOTE: If this was constructed with data as a ReadableStream, the frameCount may change as additional bytes are appended to [[encoded data]]. See the Update Tracks algorithm.
- 
           If image track is found to describe a repetition count, assign that count to newTrack’s [[repetition count]] internal slot. Otherwise, assign 0.
           NOTE: A value of Infinity indicates infinite repetitions.
- 
           Assign false to newTrack’s [[selected]] internal slot.
 
- 
           
- 
         Append newTrack to newTrackList. 
 
- 
         
- 
       Let selectedTrackIndex be the result of running the Get Default Selected Track Index algorithm with newTrackList. 
- 
       Let selectedTrack be the track at position selectedTrackIndex within newTrackList. 
- 
       Assign true to selectedTrack’s [[selected]] internal slot.
- 
       Assign selectedTrackIndex to [[internal selected track index]].
- 
       Assign true to [[tracks established]].
- 
       Queue a task on the control thread event loop to perform the following steps: - 
         Assign newTrackList to the tracks [[track list]] internal slot.
- 
         Assign selectedTrackIndex to tracks [[selected index]].
- 
         Resolve [[ready promise]].
 
- 
         
 
- 
       
- Get Default Selected Track Index (with trackList)
- 
     Run these steps: - 
       If [[encoded data]] identifies a Primary Image Track: - 
         Let primaryTrack be the ImageTrack from trackList that describes the Primary Image Track.
- 
         Let primaryTrackIndex be the position of primaryTrack within trackList. 
- 
         If [[prefer animation]] is null, return primaryTrackIndex.
- 
         If primaryTrack.animated equals [[prefer animation]], return primaryTrackIndex.
 
- 
         
- 
       If any ImageTracks in trackList have animated equal to [[prefer animation]], return the position of the earliest such track in trackList.
- 
       Return 0.
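
The selection rules above can be sketched as a pure function. This is a minimal illustration rather than normative text: TrackInfo, getDefaultSelectedTrackIndex, and the primaryTrackIndex parameter are hypothetical names, and the sketch assumes the caller has already determined whether the encoded data identifies a Primary Image Track.

```typescript
// Hypothetical, simplified stand-in for an ImageTrack; only the field that
// the algorithm reads is modeled.
interface TrackInfo {
  animated: boolean;
}

// Sketch of Get Default Selected Track Index.
// primaryTrackIndex: position of the Primary Image Track, or null when the
//   encoded data does not identify one (assumed to be known by the caller).
// preferAnimation: models the [[prefer animation]] slot; null when unset.
function getDefaultSelectedTrackIndex(
  trackList: readonly TrackInfo[],
  primaryTrackIndex: number | null,
  preferAnimation: boolean | null
): number {
  if (primaryTrackIndex !== null) {
    // With no stated preference, the primary track always wins.
    if (preferAnimation === null) return primaryTrackIndex;
    const primaryTrack = trackList[primaryTrackIndex];
    if (primaryTrack !== undefined && primaryTrack.animated === preferAnimation) {
      return primaryTrackIndex;
    }
  }
  // Otherwise pick the earliest track that matches the preference, if any.
  for (let i = 0; i < trackList.length; i++) {
    const track = trackList[i];
    if (track !== undefined && track.animated === preferAnimation) return i;
  }
  // Fall back to the first track.
  return 0;
}
```

For a file whose primary track is a still image and whose second track is animated, a preference for animation selects index 1, while no preference selects the primary track.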
 
- 
       
- Update Tracks
- 
     A track update struct is a struct that consists of a track index (unsigned long) and a frame count (unsigned long).
     Run these steps: - 
       Assert [[tracks established]] is true.
- 
       Let trackChanges be a new list. 
- 
       Let trackList be a copy of tracks' [[track list]].
- 
       For each track in trackList: - 
         Let trackIndex be the position of track in trackList. 
- 
         Let latestFrameCount be the frame count as indicated by [[encoded data]] for the track corresponding to track.
- 
         Assert that latestFrameCount is greater than or equal to track.frameCount.
- 
         If latestFrameCount is greater than track.frameCount: - 
           Let change be a track update struct whose track index is trackIndex and frame count is latestFrameCount. 
- 
           Append change to trackChanges. 
 
- 
           
 
- 
         
- 
       If trackChanges is empty, abort these steps. 
- 
       Queue a task on the control thread event loop to perform the following steps: - 
         For each update in trackChanges: - 
           Let updateTrack be the ImageTrack at position update.trackIndex within tracks' [[track list]].
- 
           Assign update.frameCount to updateTrack’s [[frame count]].
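
The two phases of Update Tracks (collect changes, then apply them in a queued task) can be sketched as follows. This is an illustrative model only: TrackUpdate, computeTrackUpdates, and applyTrackUpdates are hypothetical names, frame counts are modeled as plain number arrays, and the task queueing between the two phases is left out.

```typescript
// Models the "track update struct": a track index and a frame count.
interface TrackUpdate {
  trackIndex: number;
  frameCount: number;
}

// Phase 1: compare the frame counts currently exposed on each track with the
// counts indicated by the encoded data, and collect the tracks that grew.
function computeTrackUpdates(
  currentFrameCounts: readonly number[],
  latestFrameCounts: readonly number[]
): TrackUpdate[] {
  const trackChanges: TrackUpdate[] = [];
  for (let trackIndex = 0; trackIndex < currentFrameCounts.length; trackIndex++) {
    const current = currentFrameCounts[trackIndex];
    const latest = latestFrameCounts[trackIndex];
    if (current === undefined || latest === undefined) {
      throw new RangeError("frame count lists differ in length");
    }
    // Mirrors the assertion that a frame count never decreases.
    if (latest < current) throw new Error("frame count must not decrease");
    if (latest > current) trackChanges.push({ trackIndex, frameCount: latest });
  }
  return trackChanges;
}

// Phase 2: apply the collected updates (in the spec this runs in a task
// queued on the control thread event loop).
function applyTrackUpdates(frameCounts: number[], updates: readonly TrackUpdate[]): void {
  for (const update of updates) {
    frameCounts[update.trackIndex] = update.frameCount;
  }
}
```

An empty result from the first phase corresponds to the "abort these steps" branch: no task needs to be queued.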
 
- 
           
 
- 
         
 
- 
       
- Decode Complete Frame (with frameIndex and promise)
- 
     - 
       Assert that [[tracks established]] is true.
- 
       Assert that [[internal selected track index]] is not -1.
- 
       Let encodedFrame be the encoded frame identified by frameIndex and [[internal selected track index]].
- 
       Wait for any of the following conditions to be true (whichever happens first): - 
         [[encoded data]] contains enough bytes to completely decode encodedFrame.
- 
         [[encoded data]] is found to be malformed.
- 
         complete is true.
- 
         [[closed]] is true.
 
- 
         
- 
       If [[encoded data]] is found to be malformed, run the Fatally Reject Bad Data algorithm and abort these steps.
- 
       If [[encoded data]] does not contain enough bytes to completely decode encodedFrame, run the Reject Infeasible Decode algorithm with promise and abort these steps.
- 
       Attempt to use [[codec implementation]] to decode encodedFrame.
- 
       If decoding produces an error, run the Fatally Reject Bad Data algorithm and abort these steps. 
- 
       If [[progressive frame generations]] contains an entry keyed by frameIndex, remove the entry from the map.
- 
       Let output be the decoded image data emitted by [[codec implementation]] corresponding to encodedFrame.
- 
       Let decodeResult be a new ImageDecodeResult initialized as follows: - 
         Assign true to complete.
- 
         Let timestamp and duration be the presentation timestamp and duration for output as described by encodedFrame. If encodedFrame does not describe a timestamp or duration, assign null to the corresponding variable.
- 
         Assign image with the result of running the Create a VideoFrame algorithm with output, timestamp, and duration.
 
- 
         
- 
       Run the Resolve Decode algorithm with promise and decodeResult. 
 
- 
       
- Decode Progressive Frame (with frameIndex and promise)
- 
     - 
       Assert that [[tracks established]] is true.
- 
       Assert that [[internal selected track index]] is not -1.
- 
       Let encodedFrame be the encoded frame identified by frameIndex and [[internal selected track index]].
- 
       Let lastFrameGeneration be null.
- 
       If [[progressive frame generations]] contains a map entry with the key frameIndex, assign the value of the map entry to lastFrameGeneration.
- 
       Wait for any of the following conditions to be true (whichever happens first): - 
         [[encoded data]] contains enough bytes to decode encodedFrame to produce an output whose Progressive Image Frame Generation exceeds lastFrameGeneration.
- 
         [[encoded data]] is found to be malformed.
- 
         complete is true.
- 
         [[closed]] is true.
 
- 
         
- 
       If [[encoded data]] is found to be malformed, run the Fatally Reject Bad Data algorithm and abort these steps.
- 
       Otherwise, if [[encoded data]] does not contain enough bytes to decode encodedFrame to produce an output whose Progressive Image Frame Generation exceeds lastFrameGeneration, run the Reject Infeasible Decode algorithm with promise and abort these steps.
- 
       Attempt to use [[codec implementation]] to decode encodedFrame.
- 
       If decoding produces an error, run the Fatally Reject Bad Data algorithm and abort these steps. 
- 
       Let output be the decoded image data emitted by [[codec implementation]] corresponding to encodedFrame.
- 
       Let decodeResult be a new ImageDecodeResult.
- 
       If output is the final full-detail progressive output corresponding to encodedFrame: - 
         Assign true to decodeResult’s complete.
- 
         If [[progressive frame generations]] contains an entry keyed by frameIndex, remove the entry from the map.
 
- 
         
- 
       Otherwise: - 
         Assign false to decodeResult’s complete.
- 
         Let frameGeneration be the Progressive Image Frame Generation for output. 
- 
         Add a new entry to [[progressive frame generations]] with key frameIndex and value frameGeneration.
 
- 
         
- 
       Let timestamp and duration be the presentation timestamp and duration for output as described by encodedFrame. If encodedFrame does not describe a timestamp or duration, assign null to the corresponding variable.
- 
       Assign image with the result of running the Create a VideoFrame algorithm with output, timestamp, and duration.
- 
       Remove promise from [[pending decode promises]].
- 
       Resolve promise with decodeResult. 
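
The generation bookkeeping in Decode Progressive Frame can be modeled as a small state machine. The sketch below is an assumption-laden simplification: ProgressiveGenerationTracker, AvailableOutput, and ProgressiveOutcome are hypothetical names; "available" stands for whatever output the codec could produce from the bytes buffered so far; "noMoreData" stands for complete or [[closed]] being true; and malformed data and decode errors are not modeled.

```typescript
// Hypothetical abstraction of what the buffered bytes can currently yield.
// How a generation number is computed is implementer defined.
interface AvailableOutput {
  generation: number;
  isFinal: boolean; // true for the final full-detail output
}

type ProgressiveOutcome =
  | { kind: "wait" } // keep waiting for more bytes
  | { kind: "reject" } // corresponds to Reject Infeasible Decode
  | { kind: "resolve"; complete: boolean };

class ProgressiveGenerationTracker {
  // Models [[progressive frame generations]]: frameIndex -> last generation.
  private readonly generations = new Map<number, number>();

  step(
    frameIndex: number,
    available: AvailableOutput | null,
    noMoreData: boolean
  ): ProgressiveOutcome {
    const lastFrameGeneration = this.generations.get(frameIndex);
    // No output yet, or nothing more detailed than what was already returned.
    if (
      available === null ||
      (lastFrameGeneration !== undefined && available.generation <= lastFrameGeneration)
    ) {
      return noMoreData ? { kind: "reject" } : { kind: "wait" };
    }
    if (available.isFinal) {
      // Full detail reached: forget the generation for this frame.
      this.generations.delete(frameIndex);
      return { kind: "resolve", complete: true };
    }
    // Remember the generation so the next call must exceed it.
    this.generations.set(frameIndex, available.generation);
    return { kind: "resolve", complete: false };
  }
}
```

Each "resolve" outcome with complete set to false corresponds to a reduced-detail result; a later call for the same frameIndex only resolves once a higher generation is available.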
 
- 
       
- Resolve Decode (with promise and result)
- 
     - 
       Queue a task on the control thread event loop to run these steps: - 
         If [[closed]], abort these steps.
- 
         Assert that promise is an element of [[pending decode promises]].
- 
         Remove promise from [[pending decode promises]].
- 
         Resolve promise with result. 
 
- 
         
 
- 
       
- Reject Infeasible Decode (with promise)
- 
     - 
       Assert that complete is true or [[closed]] is true.
- 
       If complete is true, let exception be a RangeError. Otherwise, let exception be an InvalidStateError DOMException.
- 
       Queue a task on the control thread event loop to run these steps: - 
         If [[closed]], abort these steps.
- 
         Assert that promise is an element of [[pending decode promises]].
- 
         Remove promise from [[pending decode promises]].
- 
         Reject promise with exception. 
 
- 
         
 
- 
       
- Fatally Reject Bad Data
- 
     - 
       Queue a task on the control thread event loop to run these steps: - 
         If [[closed]], abort these steps.
- 
         Run the Close ImageDecoder algorithm with an EncodingError DOMException.
 
- 
         
 
- 
       
- Check Type Support (with type)
- 
     - 
       If the user agent can provide a codec to support decoding type, return true.
- 
       Otherwise, return false.
 
- 
       
- Reset ImageDecoder (with exception)
- 
     - 
       Signal [[codec implementation]] to abort any active decoding operation.
- 
       For each decodePromise in [[pending decode promises]]: - 
         Reject decodePromise with exception. 
- 
         Remove decodePromise from [[pending decode promises]].
 
- 
         
 
- 
       
- Close ImageDecoder (with exception)
- 
     - 
       Run the Reset ImageDecoder algorithm with exception. 
- 
       Assign true to [[closed]].
- 
       Clear [[codec implementation]] and release associated system resources.
- 
       Remove all entries from [[ImageTrackList]].
- 
       Assign -1 to [[ImageTrackList]]'s [[selected index]].
 
- 
       
10.3. ImageDecoderInit Interface
typedef (BufferSource or ReadableStream) ImageBufferSource;

dictionary ImageDecoderInit {
  required DOMString type;
  required ImageBufferSource data;
  PremultiplyAlpha premultiplyAlpha = "default";
  ColorSpaceConversion colorSpaceConversion = "default";
  [EnforceRange] unsigned long desiredWidth;
  [EnforceRange] unsigned long desiredHeight;
  boolean preferAnimation;
};
To determine if an ImageDecoderInit is a valid ImageDecoderInit,
run these steps:
- 
     If type is not a valid image MIME type, return false.
- 
     If data is of type ReadableStream and the ReadableStream is disturbed or locked, return false.
- 
     If data is of type BufferSource: - 
       If the result of running IsDetachedBuffer (described in [ECMASCRIPT]) on data is true, return false.
- 
       If data is empty, return false.
 
- 
       
- 
     If desiredWidth exists and desiredHeight does not exist, return false.
- 
     If desiredHeight exists and desiredWidth does not exist, return false.
- 
     Return true.
A valid image MIME type is a string that is a valid MIME type
string and for which the type, per Section 3.1.1.1 of [RFC7231], is image.
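
The validity check can be sketched over a plain-data model of the dictionary. Everything named here is hypothetical (InitModel, DataModel, isValidImageMimeType, isValidImageDecoderInit). The MIME check is a simplification that only verifies a type/subtype pair of HTTP token characters and does not validate parameters, and the sketch assumes that a detached buffer is treated as invalid.

```typescript
// Hypothetical plain-data model of the "data" member, so the sketch does not
// depend on browser types such as ReadableStream.
type DataModel =
  | { kind: "buffer"; byteLength: number; detached: boolean }
  | { kind: "stream"; disturbed: boolean; locked: boolean };

interface InitModel {
  type: string;
  data: DataModel;
  desiredWidth?: number;
  desiredHeight?: number;
}

// Simplified check: "type/subtype" made of HTTP token characters, optionally
// followed by parameters (not validated here), where type is "image".
function isValidImageMimeType(value: string): boolean {
  const token = "[!#$%&'*+\\-.^_`|~0-9A-Za-z]+";
  const match = new RegExp("^(" + token + ")/(" + token + ")(\\s*;.*)?$").exec(value);
  if (match === null) return false;
  const type = match[1];
  return type !== undefined && type.toLowerCase() === "image";
}

function isValidImageDecoderInit(init: InitModel): boolean {
  if (!isValidImageMimeType(init.type)) return false;
  const data = init.data;
  if (data.kind === "stream") {
    // A stream that was already read from, or is held by a reader, is unusable.
    if (data.disturbed || data.locked) return false;
  } else {
    // Assumption of this sketch: a detached buffer is invalid, as is an empty one.
    if (data.detached) return false;
    if (data.byteLength === 0) return false;
  }
  // desiredWidth and desiredHeight must be given together or not at all.
  const hasWidth = init.desiredWidth !== undefined;
  const hasHeight = init.desiredHeight !== undefined;
  if (hasWidth !== hasHeight) return false;
  return true;
}
```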
- type, of type DOMString
- 
     String containing the MIME type of the image file to be decoded. 
- data, of type ImageBufferSource
- 
     BufferSource or ReadableStream of bytes representing an encoded image file as described by type.
- premultiplyAlpha, of type PremultiplyAlpha, defaulting to "default"
- 
     Controls whether decoded outputs' color channels are to be premultiplied by their alpha channel, as defined by premultiplyAlpha in ImageBitmapOptions.
- colorSpaceConversion, of type ColorSpaceConversion, defaulting to "default"
- 
     Controls whether decoded outputs' color space is converted or ignored, as defined by colorSpaceConversion in ImageBitmapOptions.
- desiredWidth, of type unsigned long
- 
     Indicates a desired width for decoded outputs. Implementation is best effort; decoding to a desired width may not be supported by all formats/decoders. 
- desiredHeight, of type unsigned long
- 
     Indicates a desired height for decoded outputs. Implementation is best effort; decoding to a desired height may not be supported by all formats/decoders. 
- preferAnimation, of type boolean
- 
     For images with multiple tracks, this indicates whether the initial track selection should prefer an animated track. NOTE: See the Get Default Selected Track Index algorithm. 
10.4. ImageDecodeOptions Interface
dictionary ImageDecodeOptions {
  [EnforceRange] unsigned long frameIndex = 0;
  boolean completeFramesOnly = true;
};
- frameIndex, of type unsigned long, defaulting to 0
- 
     The index of the frame to decode. 
- completeFramesOnly, of type boolean, defaulting to true
- 
     For Progressive Images, a value of false indicates that the decoder may output an image with reduced detail. Each subsequent call to decode() for the same frameIndex will resolve to produce an image with a higher Progressive Image Frame Generation (more image detail) than the previous call, until finally the full-detail image is produced.
     If completeFramesOnly is assigned true, or if the image is not a Progressive Image, or if the user agent does not support progressive decoding for the given image type, calls to decode() will only resolve once the full-detail image is decoded.
     NOTE: For Progressive Images, setting completeFramesOnly to false may be used to offer users a preview of an image that is still being buffered from the network (via the data ReadableStream).
     Upon decoding the full-detail image, the ImageDecodeResult's complete will be set to true.
10.5. ImageDecodeResult Interface
dictionary ImageDecodeResult {
  required VideoFrame image;
  required boolean complete;
};
- image, of type VideoFrame
- 
     The decoded image. 
- complete, of type boolean
- 
     Indicates whether image contains the final full-detail output.
     NOTE: complete is always true when decode() is invoked with completeFramesOnly set to true.
10.6. ImageTrackList Interface
[Exposed=(Window,DedicatedWorker)]
interface ImageTrackList {
  getter ImageTrack (unsigned long index);

  readonly attribute Promise<undefined> ready;
  [EnforceRange] readonly attribute unsigned long length;
  [EnforceRange] readonly attribute long selectedIndex;
  readonly attribute ImageTrack? selectedTrack;
};
10.6.1. Internal Slots
- [[ready promise]]
- 
     The promise used to signal when the ImageTrackList has been populated with ImageTracks.
     NOTE: ImageTrack frameCount may receive subsequent updates until complete is true.
- [[track list]]
- 
     The list of ImageTracks described by this ImageTrackList.
- [[selected index]]
- 
     The index of the selected track in [[track list]]. A value of -1 indicates that no track is selected.
10.6.2. Attributes
- ready, of type Promise<undefined>, readonly
- 
     The ready getter steps are to return the [[ready promise]].
- length, of type unsigned long, readonly
- 
     The length getter steps are to return the length of [[track list]].
- selectedIndex, of type long, readonly
- 
     The selectedIndex getter steps are to return [[selected index]].
- selectedTrack, of type ImageTrack, readonly, nullable
- 
     The selectedTrack getter steps are: - 
       If [[selected index]] is -1, return null.
- 
       Otherwise, return the ImageTrack from [[track list]] at the position indicated by [[selected index]].
 
- 
       
10.7. ImageTrack Interface
[Exposed=(Window,DedicatedWorker)]
interface ImageTrack : EventTarget {
  readonly attribute boolean animated;
  [EnforceRange] readonly attribute unsigned long frameCount;
  [EnforceRange] readonly attribute unrestricted float repetitionCount;
  attribute EventHandler onchange;
  attribute boolean selected;
};
10.7.1. Internal Slots
- [[ImageDecoder]]
- 
     The ImageDecoder instance that constructed this ImageTrack.
- [[ImageTrackList]]
- 
     The ImageTrackList instance that lists this ImageTrack.
- [[animated]]
- 
     Indicates whether this track contains an animated image with multiple frames. 
- [[frame count]]
- 
     The number of frames in this track. 
- [[repetition count]]
- 
     The number of times the animation is intended to repeat. 
- [[selected]]
- 
     Indicates whether this track is selected for decoding. 
10.7.2. Attributes
- animated, of type boolean, readonly
- 
     The animated getter steps are to return the value of [[animated]].
     NOTE: This attribute provides an early indication that frameCount will ultimately exceed 0 for images where the frameCount starts at 0 and later increments as new chunks of the ReadableStream data arrive.
- frameCount, of type unsigned long, readonly
- 
     The frameCount getter steps are to return the value of [[frame count]].
- repetitionCount, of type unrestricted float, readonly
- 
     The repetitionCount getter steps are to return the value of [[repetition count]].
- onchange, of type EventHandler
- 
     An event handler IDL attribute whose event handler event type is change.
- selected, of type boolean
- 
     The selected getter steps are to return the value of [[selected]].
     The selected setter steps are: - 
       If [[ImageDecoder]]'s [[closed]] slot is true, abort these steps.
- 
       Let newValue be the given value. 
- 
       If newValue equals [[selected]], abort these steps.
- 
       Assign newValue to [[selected]].
- 
       Let parentTrackList be [[ImageTrackList]].
- 
       Let oldSelectedIndex be the value of parentTrackList [[selected index]].
- 
       If oldSelectedIndex is not -1: - 
         Let oldSelectedTrack be the ImageTrack in parentTrackList [[track list]] at the position of oldSelectedIndex.
- 
         Assign false to oldSelectedTrack [[selected]].
 
- 
         
- 
       If newValue is true, let selectedIndex be the index of this ImageTrack within parentTrackList’s [[track list]]. Otherwise, let selectedIndex be -1.
- 
       Assign selectedIndex to parentTrackList [[selected index]].
- 
       Run the Reset ImageDecoder algorithm on [[ImageDecoder]].
- 
       Queue a control message to [[ImageDecoder]]'s control message queue to update the internal selected track index with selectedIndex.
 Running a control message to update the internal selected track index means running these steps: - 
       Assign selectedIndex to [[internal selected track index]].
- 
       Remove all entries from [[progressive frame generations]].
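
The bookkeeping performed by the selected setter can be sketched with plain objects. This is a simplified model under stated assumptions: TrackModel, TrackListModel, and setSelected are hypothetical names, the [[closed]] check is omitted, and the decoder reset and the queued control message are represented only by the boolean return value.

```typescript
// Hypothetical stand-ins for ImageTrack and ImageTrackList state.
interface TrackModel {
  selected: boolean;
}

interface TrackListModel {
  tracks: TrackModel[];
  selectedIndex: number; // -1 means no track is selected
}

// Returns true when the selection changed, i.e. when the caller would go on
// to reset the decoder and queue the control message that updates the
// internal selected track index.
function setSelected(list: TrackListModel, trackIndex: number, newValue: boolean): boolean {
  const track = list.tracks[trackIndex];
  if (track === undefined) throw new RangeError("no such track");
  if (newValue === track.selected) return false; // nothing to do

  track.selected = newValue;

  // Deselect whichever track was selected before, if any.
  const oldSelectedIndex = list.selectedIndex;
  if (oldSelectedIndex !== -1) {
    const oldSelectedTrack = list.tracks[oldSelectedIndex];
    if (oldSelectedTrack !== undefined) oldSelectedTrack.selected = false;
  }

  // Selecting a track records its index; deselecting leaves none selected.
  list.selectedIndex = newValue ? trackIndex : -1;
  return true;
}
```

Selecting a second track therefore deselects the first, and setting selected to false on the current track leaves the list with a selected index of -1.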
 
- 
       
10.7.3. Event Summary
- change
- 
     Fired at the ImageTrack when the frameCount is altered.
11. Security Considerations
The primary security impact is that features of this API make it easier for an attacker to exploit vulnerabilities in the underlying platform codecs. Additionally, new abilities to configure and control the codecs may allow for new exploits that rely on a specific configuration and/or sequence of control operations.
Platform codecs are historically an internal detail of APIs like HTMLMediaElement, [WEBAUDIO], and [WebRTC]. In this way, it has always
been possible to attack the underlying codecs by using malformed media
files/streams and invoking the various API control methods.
For example, you can send any stream to a decoder by first wrapping that stream
in a media container (e.g. mp4) and setting that as the src of an HTMLMediaElement. You can then cause the underlying video decoder to
be reset() by setting a new value for <video>.currentTime.
WebCodecs makes such attacks easier by exposing low-level control over when inputs are provided and direct access to invoke the codec control methods. This also affords attackers the ability to invoke sequences of control methods that were not previously possible via the higher-level APIs.
User agents should mitigate this risk by extensively fuzzing their implementation with random inputs and control method invocations. Additionally, user agents are encouraged to isolate their underlying codecs in processes with restricted privileges (sandbox) as a barrier against successful exploits being able to read user data.
An additional concern is exposing the underlying codecs to input mutation race conditions. Specifically, it should not be possible for a site to mutate a codec input or output while the underlying codec may still be operating on that data. This concern is mitigated by ensuring that input and output interfaces are immutable.
12. Privacy Considerations
The primary privacy impact is an increased ability to fingerprint users by querying for different codec capabilities to establish a codec feature profile. Much of this profile is already exposed by existing APIs. Such profiles are very unlikely to be uniquely identifying, but may be used with other metrics to create a fingerprint.
An attacker may accumulate a codec feature profile by calling isConfigSupported() methods with a number of different configuration
dictionaries. Similarly, an attacker may attempt to configure() a codec with
different configuration dictionaries and observe which configurations are
accepted.
Attackers may also use existing APIs to establish much of the codec feature
profile. For example, the [media-capabilities] decodingInfo() API
describes what types of decoders are supported and its powerEfficient attribute may signal when a decoder uses hardware acceleration. Similarly, the [WebRTC] getCapabilities() API may be used to determine what
types of encoders are supported and the getStats() API may
be used to determine when an encoder uses hardware acceleration. WebCodecs will
expose some additional information in the form of low level codec features.
A codec feature profile alone is unlikely to be uniquely identifying. Underlying codecs are often implemented entirely in software (be it part of the user agent binary or part of the operating system), such that all users who run that software will have a common set of capabilities. Additionally, underlying codecs are often implemented with hardware acceleration, but such hardware is mass produced and devices of a particular class and manufacture date (e.g. flagship phones manufactured in 2020) will often have common capabilities. There will be outliers (some users may run outdated versions of software codecs or use a rare mix of custom assembled hardware), but most of the time a given codec feature profile is shared by a large group of users.
Segmenting groups of users by codec feature profile still amounts to a bit of entropy that can be combined with other metrics to uniquely identify a user. User agents may partially mitigate this by returning an error whenever a site attempts to exhaustively probe for codec capabilities. Additionally, user agents may implement a "privacy budget", which depletes as authors use WebCodecs and other identifying APIs. Upon exhaustion of the privacy budget, codec capabilities could be reduced to a common baseline or prompt for user approval.