1. Definitions
- Codec
- Refers generically to an instance of AudioDecoder, AudioEncoder, VideoDecoder, or VideoEncoder.
- Key Frame
- An encoded frame that does not depend on any other frames for decoding.
- Internal Pending Output
- Codec outputs, such as VideoFrames, that currently reside in the internal pipeline of the underlying codec implementation. The underlying codec implementation may emit new outputs only when new inputs are provided. The underlying codec implementation must emit all outputs in response to a flush.
- Codec System Resources
- Resources including CPU memory, GPU memory, and exclusive handles to specific decoding/encoding hardware that may be allocated by the User Agent as part of codec configuration or generation of AudioFrame and VideoFrame objects. Such resources may be quickly exhausted and should be released immediately when no longer in use.
2. Codec Processing Model
2.1. Background
This section is non-normative.
The codec interfaces defined by the specification are designed such that new
codec tasks may be scheduled while previous tasks are still pending. For
example, web authors may call decode()
without waiting for a previous decode()
to complete. This is achieved by offloading underlying codec tasks to
a separate thread for parallel execution.
This section describes threading behaviors as they are visible from the perspective of web authors. Implementers may choose to use more or fewer threads, as long as the externally visible behaviors of blocking and sequencing are maintained as follows.
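For example (non-normative), the following sketch queues several decodes without awaiting each one; only flush() returns a promise that settles once previously queued work has completed. The config and chunks values are assumed to be supplied by the author.
// Non-normative sketch. Assumes `config` is a supported VideoDecoderConfig and
// `chunks` is an author-supplied array of EncodedVideoChunk objects.
const decoder = new VideoDecoder({
  output: (frame) => { /* consume the frame */ frame.destroy(); },
  error: (e) => console.error(e),
});
decoder.configure(config);     // queues a control message and returns immediately
for (const chunk of chunks) {
  decoder.decode(chunk);       // each call queues a control message without blocking
}
await decoder.flush();         // resolves after all queued work has produced its outputs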
2.2. Control Thread and Codec Thread
All steps in this specification will run on either a control thread or a codec thread.
The control thread is the thread from which authors will construct a codec and invoke its methods. Invoking a codec’s methods will typically result in the creation of control messages which are later executed on the codec thread. Each global object has a separate control thread.
The codec thread is the thread from which a codec will dequeue control messages and execute their steps. Each codec instance has a separate codec thread. The lifetime of a codec thread matches that of its associated codec instance.
The control thread uses a traditional event loop, as described in [HTML].
The codec thread uses a specialized codec processing loop.
Communication from the control thread to the codec thread is done using control message passing. Communication in the other direction is done using regular event loop tasks.
Each codec instance has a single control message queue that is a queue of control messages.
Queuing a control message means enqueuing the message to a codec’s control message queue. Invoking codec methods will often queue a control message to schedule work.
Running a control message means performing a sequence of steps specified by the method that enqueued the message. The steps of a control message may depend on injected state, supplied by the method that enqueued the message.
Resetting the control message queue means performing these steps:
-
For each control message in the control message queue:
-
If a control message’s injected state includes a promise, reject that promise.
-
Remove the message from the queue.
-
The codec processing loop must run these steps:
-
While true:
-
If the control message queue is empty, continue.
-
Dequeue front message from the control message queue.
-
Run control message steps described by front message.
-
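Expressed as a non-normative sketch (the names controlMessageQueue and runControlMessage are illustrative, not part of the API), the codec processing loop amounts to:
// Non-normative illustration of the codec processing loop run on each codec thread.
while (true) {
  if (controlMessageQueue.length === 0) continue;  // no pending work; keep waiting
  const front = controlMessageQueue.shift();       // dequeue the front control message
  runControlMessage(front);                        // run the steps described by the message
}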
3. AudioDecoder Interface
[Exposed=(Window,DedicatedWorker)]
interface AudioDecoder {
  constructor(AudioDecoderInit init);

  readonly attribute CodecState state;
  readonly attribute long decodeQueueSize;

  undefined configure(AudioDecoderConfig config);
  undefined decode(EncodedAudioChunk chunk);
  Promise<undefined> flush();
  undefined reset();
  undefined close();

  static Promise<AudioDecoderSupport> isConfigSupported(AudioDecoderConfig config);
};

dictionary AudioDecoderInit {
  required AudioFrameOutputCallback output;
  required WebCodecsErrorCallback error;
};

callback AudioFrameOutputCallback = undefined(AudioFrame output);
3.1. Internal Slots
[[codec implementation]]
- Underlying decoder implementation provided by the User Agent.
[[output callback]]
- Callback given at construction for decoded outputs.
[[error callback]]
- Callback given at construction for decode errors.
3.2. Constructors
AudioDecoder(init)
-
Let d be a new
AudioDecoder
object. -
Assign init.output to the
[[output callback]]
internal slot. -
Assign init.error to the
[[error callback]]
internal slot. -
Assign "unconfigured" to d.state.
-
Return d.
3.3. Attributes
-
state
, of type CodecState, readonly - Describes the current state of the codec.
-
decodeQueueSize
, of type long, readonly - The number of pending decode requests. This number will decrease as the underlying codec is ready to accept new input.
3.4. Methods
configure(config)
-
Enqueues a control message to configure the audio decoder for decoding
chunks as described by config.
NOTE: This method will trigger a NotSupportedError if the user agent does not support config. Authors should first check support by calling isConfigSupported() with config. User agents are not required to support any particular codec type or configuration.
When invoked, run these steps:
-
If config is not a valid AudioDecoderConfig, throw a
TypeError
. -
If
state
is "closed", throw an InvalidStateError
. -
Set
state
to"configured"
. -
Queue a control message to configure the decoder with config.
Running a control message to configure the decoder means running these steps:
-
Let supported be the result of running the Check Configuration Support algorithm with config.
-
If supported is
true
, assign[[codec implementation]]
with an implementation supporting config. -
Otherwise, run the Close AudioDecoder algorithm with
NotSupportedError
.
-
decode(chunk)
-
Enqueues a control message to decode the given chunk.
When invoked, run these steps:
-
If
state
is not"configured"
, throw anInvalidStateError
. -
Increment
decodeQueueSize
. -
Queue a control message to decode the chunk.
Running a control message to decode the chunk means performing these steps:
-
Attempt to use
[[codec implementation]]
to decode the chunk. -
If decoding results in an error, queue a task on the control thread event loop to run the Close AudioDecoder algorithm with
EncodingError
. -
Queue a task on the control thread event loop to decrement decodeQueueSize.
-
Let decoded outputs be a list of decoded audio data outputs emitted by
[[codec implementation]]
. -
If decoded outputs is not empty, queue a task on the control thread event loop to run the Output AudioFrames algorithm with decoded outputs.
-
flush()
-
Completes all control messages in the control message queue and emits all outputs.
When invoked, run these steps:
-
If
state
is not"configured"
, return a promise rejected withInvalidStateError
DOMException
. -
Let promise be a new Promise.
-
Queue a control message to flush the codec with promise.
-
Return promise.
Running a control message to flush the codec means performing these steps with promise.
-
Signal
[[codec implementation]]
to emit all internal pending outputs. -
Let decoded outputs be a list of decoded audio data outputs emitted by
[[codec implementation]]
. -
If decoded outputs is not empty, queue a task on the control thread event loop to run the Output AudioFrames algorithm with decoded outputs.
-
Queue a task on the control thread event loop to resolve promise.
-
reset()
-
Immediately resets all state including configuration, control messages in the control message queue, and all pending
callbacks.
When invoked, run the Reset AudioDecoder algorithm.
close()
-
Immediately aborts all pending work and releases system resources.
Close is final.
When invoked, run the Close AudioDecoder algorithm.
isConfigSupported(config)
-
Returns a promise indicating whether the provided config is supported by
the user agent.
NOTE: The returned AudioDecoderSupport config will contain only the dictionary members that the user agent recognized. Unrecognized dictionary members will be ignored. Authors may detect unrecognized dictionary members by comparing config to their provided config (see the example at the end of this section).
When invoked, run these steps:
-
If config is not a valid AudioDecoderConfig, return a promise rejected with
TypeError
. -
Let p be a new Promise.
-
Let checkSupportQueue be the result of starting a new parallel queue.
-
Enqueue the following steps to checkSupportQueue:
-
Let decoderSupport be a newly constructed
AudioDecoderSupport
, initialized as follows:-
Set
config
to the result of running the Clone Configuration algorithm with config. -
Set
supported
to the result of running the Check Configuration Support algorithm with config.
-
-
Resolve p with decoderSupport.
-
-
Return p.
-
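For example (non-normative), support can be checked and unrecognized dictionary members detected before configuring, as noted above; the candidate values and callbacks below are illustrative only.
// Non-normative sketch. Support for any particular codec string is implementation-specific.
const candidate = { codec: 'opus', sampleRate: 48000, numberOfChannels: 2 };
const { supported, config } = await AudioDecoder.isConfigSupported(candidate);
// `config` echoes back only the members the user agent recognized.
const unrecognized = Object.keys(candidate).filter((key) => !(key in config));
if (supported && unrecognized.length === 0) {
  // `handleAudioFrame` and `handleError` are hypothetical author callbacks.
  const decoder = new AudioDecoder({ output: handleAudioFrame, error: handleError });
  decoder.configure(candidate);
}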
3.5. Algorithms
- Output AudioFrames (with outputs)
-
Run these steps:
-
For each output in outputs:
-
Let buffer be an
AudioBuffer
containing the decoded audio data in output. -
Let frame be an
AudioFrame
containing buffer and a timestamp for the output. -
Invoke
[[output callback]]
with frame.
-
-
- Reset AudioDecoder
-
Run these steps:
-
If
state
is"closed"
, throw anInvalidStateError
. -
Set
state
to"unconfigured"
. -
Signal
[[codec implementation]]
to cease producing output for the previous configuration. -
Set
decodeQueueSize
to zero.
-
- Close AudioDecoder (with error)
-
Run these steps:
-
Run the Reset AudioDecoder algorithm.
-
Set
state
to"closed"
. -
Clear
[[codec implementation]]
and release associated system resources. -
If error is set, queue a task on the control thread event loop to invoke the
[[error callback]]
with error.
-
4. VideoDecoder Interface
[Exposed=(Window,DedicatedWorker)]
interface VideoDecoder {
  constructor(VideoDecoderInit init);

  readonly attribute CodecState state;
  readonly attribute long decodeQueueSize;

  undefined configure(VideoDecoderConfig config);
  undefined decode(EncodedVideoChunk chunk);
  Promise<undefined> flush();
  undefined reset();
  undefined close();

  static Promise<VideoDecoderSupport> isConfigSupported(VideoDecoderConfig config);
};

dictionary VideoDecoderInit {
  required VideoFrameOutputCallback output;
  required WebCodecsErrorCallback error;
};

callback VideoFrameOutputCallback = undefined(VideoFrame output);
4.1. Internal Slots
[[codec implementation]]
- Underlying decoder implementation provided by the User Agent.
[[output callback]]
- Callback given at construction for decoded outputs.
[[error callback]]
- Callback given at construction for decode errors.
4.2. Constructors
VideoDecoder(init)
-
Let d be a new VideoDecoder object.
-
Assign
init.output
to the[[output callback]]
internal slot. -
Assign
init.error
to the[[error callback]]
internal slot. -
Assign "unconfigured" to
d.state
. -
Return d.
4.3. Attributes
-
state
, of type CodecState, readonly - Describes the current state of the codec.
-
decodeQueueSize
, of type long, readonly - The number of pending decode requests. This number will decrease as the underlying codec is ready to accept new input.
4.4. Methods
configure(config)
-
Enqueues a control message to configure the video decoder for decoding
chunks as described by config.
NOTE: This method will trigger a NotSupportedError if the user agent does not support config. Authors should first check support by calling isConfigSupported() with config. User agents are not required to support any particular codec type or configuration.
When invoked, run these steps:
-
If config is not a valid VideoDecoderConfig, throw a
TypeError
. -
If
state
is "closed", throw an InvalidStateError
. -
Set
state
to"configured"
. -
Queue a control message to configure the decoder with config.
Running a control message to configure the decoder means running these steps:
-
Let supported be the result of running the Check Configuration Support algorithm with config.
-
If supported is
true
, assign[[codec implementation]]
with an implementation supporting config. -
Otherwise, run the Close VideoDecoder algorithm with
NotSupportedError
.
-
decode(chunk)
-
Enqueues a control message to decode the given chunk.
When invoked, run these steps:
-
If
state
is not"configured"
, throw anInvalidStateError
. -
Increment
decodeQueueSize
. -
Queue a control message to decode the chunk.
Running a control message to decode the chunk means performing these steps:
-
Attempt to use
[[codec implementation]]
to decode the chunk. -
If decoding results in an error, queue a task on the control thread event loop to run the Close VideoDecoder algorithm with
EncodingError
. -
Queue a task on the control thread event loop to decrement decodeQueueSize.
-
Let decoded outputs be a list of decoded video data outputs emitted by
[[codec implementation]]
. -
If decoded outputs is not empty, queue a task on the control thread event loop to run the Output VideoFrames algorithm with decoded outputs.
-
flush()
-
Completes all control messages in the control message queue and emits all outputs.
When invoked, run these steps:
-
If
state
is not"configured"
, return a promise rejected withInvalidStateError
DOMException
. -
Let promise be a new Promise.
-
Queue a control message to flush the codec with promise.
-
Return promise.
Running a control message to flush the codec means performing these steps with promise.
-
Signal
[[codec implementation]]
to emit all internal pending outputs. -
Let decoded outputs be a list of decoded video data outputs emitted by
[[codec implementation]]
. -
If decoded outputs is not empty, queue a task on the control thread event loop to run the Output VideoFrames algorithm with decoded outputs.
-
Queue a task on the control thread event loop to resolve promise.
-
reset()
-
Immediately resets all state including configuration, control messages in the control message queue, and all pending
callbacks.
When invoked, run the Reset VideoDecoder algorithm.
close()
-
Immediately aborts all pending work and releases system resources.
Close is final.
When invoked, run the Close VideoDecoder algorithm.
isConfigSupported(config)
-
Returns a promise indicating whether the provided config is supported by
the user agent.
NOTE: The returned VideoDecoderSupport config will contain only the dictionary members that the user agent recognized. Unrecognized dictionary members will be ignored. Authors may detect unrecognized dictionary members by comparing config to their provided config.
When invoked, run these steps:
-
If config is not a valid VideoDecoderConfig, return a promise rejected with
TypeError
. -
Let p be a new Promise.
-
Let checkSupportQueue be the result of starting a new parallel queue.
-
Enqueue the following steps to checkSupportQueue:
-
Let decoderSupport be a newly constructed
VideoDecoderSupport
, initialized as follows:-
Set
config
to the result of running the Clone Configuration algorithm with config. -
Set
supported
to the result of running the Check Configuration Support algorithm with config.
-
-
Resolve p with decoderSupport.
-
-
Return p.
-
4.5. Algorithms
- Output VideoFrames (with outputs)
-
Run these steps:
-
For each output in outputs:
-
Let planes be a sequence of
Plane
s containing the decoded video frame data from output. -
Let pixelFormat be the
PixelFormat
of planes. -
Let frameInit be a
VideoFrameInit
with the following keys:-
Let
timestamp
andduration
be thetimestamp
andduration
from theEncodedVideoChunk
associated with output. -
Let
codedWidth
andcodedHeight
be the width and height of the decoded video frame output in pixels, prior to any cropping or aspect ratio adjustments. -
Let
cropLeft
,cropTop
,cropWidth
, andcropHeight
be the crop region of the decoded video frame output in pixels, prior to any aspect ratio adjustments. -
Let
displayWidth
anddisplayHeight
be the display size of the decoded video frame in pixels.
-
-
Let frame be a
VideoFrame
, constructed with pixelFormat, planes, and frameInit. -
Invoke
[[output callback]]
with frame.
-
-
- Reset VideoDecoder
-
Run these steps:
-
If
state
is"closed"
, throw anInvalidStateError
. -
Set
state
to"unconfigured"
. -
Signal
[[codec implementation]]
to cease producing output for the previous configuration. -
Set
decodeQueueSize
to zero.
-
- Close VideoDecoder (with error)
-
Run these steps:
-
Run the Reset VideoDecoder algorithm.
-
Set
state
to"closed"
. -
Clear
[[codec implementation]]
and release associated system resources. -
If error is set, queue a task on the control thread event loop to invoke the
[[error callback]]
with error.
-
5. AudioEncoder Interface
[Exposed=(Window,DedicatedWorker)]
interface AudioEncoder {
  constructor(AudioEncoderInit init);

  readonly attribute CodecState state;
  readonly attribute long encodeQueueSize;

  undefined configure(AudioEncoderConfig config);
  undefined encode(AudioFrame frame);
  Promise<undefined> flush();
  undefined reset();
  undefined close();

  static Promise<AudioEncoderSupport> isConfigSupported(AudioEncoderConfig config);
};

dictionary AudioEncoderInit {
  required EncodedAudioChunkOutputCallback output;
  required WebCodecsErrorCallback error;
};

callback EncodedAudioChunkOutputCallback = undefined(EncodedAudioChunk output);
5.1. Internal Slots
[[codec implementation]]
- Underlying encoder implementation provided by the User Agent.
[[output callback]]
- Callback given at construction for encoded outputs.
[[error callback]]
- Callback given at construction for encode errors.
5.2. Constructors
AudioEncoder(init)
-
Let e be a new AudioEncoder object.
-
Assign
init.output
to the[[output callback]]
internal slot. -
Assign
init.error
to the[[error callback]]
internal slot. -
Assign "unconfigured" to
e.state
. -
Return e.
5.3. Attributes
-
state
, of type CodecState, readonly - Describes the current state of the codec.
-
encodeQueueSize
, of type long, readonly - The number of pending encode requests. This number will decrease as the underlying codec is ready to accept new input.
5.4. Methods
configure(config)
-
Enqueues a control message to configure the audio encoder for encoding frames as described by config.
NOTE: This method will trigger a NotSupportedError if the user agent does not support config. Authors should first check support by calling isConfigSupported() with config. User agents are not required to support any particular codec type or configuration.
When invoked, run these steps:
-
If config is not a valid AudioEncoderConfig, throw a
TypeError
. -
If
state
is"closed"
, throw anInvalidStateError
. -
Set
state
to"configured"
. -
Queue a control message to configure the encoder using config.
Running a control message to configure the encoder means performing these steps:
-
Let supported be the result of running the Check Configuration Support algorithm with config.
-
If supported is
true
, assign[[codec implementation]]
with an implementation supporting config. -
Otherwise, run the Close AudioEncoder algorithm with
NotSupportedError
.
-
encode(frame)
-
Enqueues a control message to encode the given frame.
When invoked, run these steps:
-
If the value of frame’s
[[detached]]
internal slot istrue
, throw aTypeError
. -
If
state
is not"configured"
, throw anInvalidStateError
. -
Let frameClone hold the result of running the Clone Frame algorithm with frame.
-
Increment
encodeQueueSize
. -
Queue a control message to encode frameClone.
Running a control message to encode the frame means performing these steps.
-
Attempt to use
[[codec implementation]]
to encode frameClone. -
If encoding results in an error, queue a task on the control thread event loop to run the Close AudioEncoder algorithm with
EncodingError
. -
Queue a task on the control thread event loop to decrement
encodeQueueSize
. -
Let encoded outputs be a list of encoded audio data outputs emitted by
[[codec implementation]]
. -
If encoded outputs is not empty, queue a task on the control thread event loop to run the Output EncodedAudioChunks algorithm with encoded outputs.
-
flush()
-
Completes all control messages in the control message queue and emits all outputs.
When invoked, run these steps:
-
If
state
is not"configured"
, return a promise rejected withInvalidStateError
DOMException
. -
Let promise be a new Promise.
-
Queue a control message to flush the codec with promise.
-
Return promise.
Running a control message to flush the codec means performing these steps with promise.
-
Signal
[[codec implementation]]
to emit all internal pending outputs. -
Let encoded outputs be a list of encoded audio data outputs emitted by
[[codec implementation]]
. -
If encoded outputs is not empty, queue a task on the control thread event loop to run the Output EncodedAudioChunks algorithm with encoded outputs.
-
Queue a task on the control thread event loop to resolve promise.
-
reset()
-
Immediately resets all state including configuration, control messages in the control message queue, and all pending
callbacks.
When invoked, run the Reset AudioEncoder algorithm.
close()
-
Immediately aborts all pending work and releases system resources.
Close is final.
When invoked, run the Close AudioEncoder algorithm.
isConfigSupported(config)
-
Returns a promise indicating whether the provided config is supported by
the user agent.
NOTE: The returned AudioEncoderSupport config will contain only the dictionary members that the user agent recognized. Unrecognized dictionary members will be ignored. Authors may detect unrecognized dictionary members by comparing config to their provided config.
When invoked, run these steps:
-
If config is not a valid AudioEncoderConfig, return a promise rejected with
TypeError
. -
Let p be a new Promise.
-
Let checkSupportQueue be the result of starting a new parallel queue.
-
Enqueue the following steps to checkSupportQueue:
-
Let encoderSupport be a newly constructed
AudioEncoderSupport
, initialized as follows:-
Set
config
to the result of running the Clone Configuration algorithm with config. -
Set
supported
to the result of running the Check Configuration Support algorithm with config.
-
-
Resolve p with encoderSupport.
-
-
Return p.
-
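A non-normative usage sketch follows; the codec string, bitrate, and the nextAudioFrame() source of AudioFrames are illustrative assumptions.
// Non-normative sketch. `nextAudioFrame()` and `storeChunk()` are hypothetical author functions.
const encoder = new AudioEncoder({
  output: (chunk) => storeChunk(chunk),
  error: (e) => console.error(e),
});
encoder.configure({ codec: 'opus', sampleRate: 48000, numberOfChannels: 2, bitrate: 128000 });
let frame;
while ((frame = nextAudioFrame()) !== null) {
  encoder.encode(frame);  // the encoder works from a clone of the frame,
  frame.close();          // so the author's copy can be closed immediately
}
await encoder.flush();
encoder.close();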
5.5. Algorithms
- Output EncodedAudioChunks (with outputs)
-
Run these steps:
-
For each output in outputs:
-
Let chunkInit be an
EncodedAudioChunkInit
with the following keys:-
Let
data
contain the encoded audio data from output. -
Let
type
be theEncodedAudioChunkType
of output. -
Let
timestamp
be thetimestamp
from the AudioFrame associated with output.
-
-
Let chunk be a new
EncodedAudioChunk
constructed with chunkInit. -
Invoke
[[output callback]]
with chunk.
-
-
- Reset AudioEncoder
-
Run these steps:
-
If
state
is"closed"
, throw anInvalidStateError
. -
Set
state
to"unconfigured"
. -
Signal
[[codec implementation]]
to cease producing output for the previous configuration. -
Set
encodeQueueSize
to zero.
-
- Close AudioEncoder (with error)
-
Run these steps:
-
Run the Reset AudioEncoder algorithm.
-
Set
state
to"closed"
. -
Clear
[[codec implementation]]
and release associated system resources. -
If error is set, queue a task on the control thread event loop to invoke the
[[error callback]]
with error.
-
6. VideoEncoder Interface
[Exposed=(Window,DedicatedWorker)]
interface VideoEncoder {
  constructor(VideoEncoderInit init);

  readonly attribute CodecState state;
  readonly attribute long encodeQueueSize;

  undefined configure(VideoEncoderConfig config);
  undefined encode(VideoFrame frame, optional VideoEncoderEncodeOptions options = {});
  Promise<undefined> flush();
  undefined reset();
  undefined close();

  static Promise<VideoEncoderSupport> isConfigSupported(VideoEncoderConfig config);
};

dictionary VideoEncoderInit {
  required EncodedVideoChunkOutputCallback output;
  required WebCodecsErrorCallback error;
};

callback EncodedVideoChunkOutputCallback = undefined(EncodedVideoChunk output, VideoDecoderConfig? output_config);
6.1. Internal Slots
[[codec implementation]]
- Underlying encoder implementation provided by the User Agent.
[[output callback]]
- Callback given at construction for encoded outputs.
[[error callback]]
- Callback given at construction for encode errors.
[[active encoder config]]
- The
VideoEncoderConfig
that is actively applied. [[active output config]]
- The
VideoDecoderConfig
that describes how to decode the most recently emittedEncodedVideoChunk
.
6.2. Constructors
VideoEncoder(init)
-
Let e be a new VideoEncoder object.
-
Assign
init.output
to the[[output callback]]
internal slot. -
Assign
init.error
to the[[error callback]]
internal slot. -
Assign "unconfigured" to
e.state
. -
Return e.
6.3. Attributes
-
state
, of type CodecState, readonly - Describes the current state of the codec.
-
encodeQueueSize
, of type long, readonly - The number of pending encode requests. This number will decrease as the underlying codec is ready to accept new input.
6.4. Methods
configure(config)
-
Enqueues a control message to configure the video encoder for encoding frames as described by config.
NOTE: This method will trigger a NotSupportedError if the user agent does not support config. Authors should first check support by calling isConfigSupported() with config. User agents are not required to support any particular codec type or configuration.
When invoked, run these steps:
-
If config is not a valid VideoEncoderConfig, throw a
TypeError
. -
If
state
is"closed"
, throw anInvalidStateError
. -
Set
state
to"configured"
. -
Queue a control message to configure the encoder using config.
Running a control message to configure the encoder means performing these steps:
-
Let supported be the result of running the Check Configuration Support algorithm with config.
-
If supported is
true
, assign[[codec implementation]]
with an implementation supporting config. -
Otherwise, run the Close VideoEncoder algorithm with
NotSupportedError
and abort these steps. -
Set
[[active encoder config]]
toconfig
.
-
encode(frame, options)
-
Enqueues a control message to encode the given frame.
When invoked, run these steps:
-
If the value of frame’s
[[detached]]
internal slot istrue
, throw aTypeError
. -
If
state
is not"configured"
, throw anInvalidStateError
. -
Let frameClone hold the result of running the Clone Frame algorithm with frame.
-
Increment
encodeQueueSize
. -
Queue a control message to encode frameClone.
Running a control message to encode the frame means performing these steps.
-
Attempt to use
[[codec implementation]]
to encode frameClone according to options. -
If encoding results in an error, queue a task on the control thread event loop to run the Close VideoEncoder algorithm with
EncodingError
. -
Queue a task on the control thread event loop to decrement
encodeQueueSize
. -
Let encoded outputs be a list of encoded video data outputs emitted by
[[codec implementation]]
. -
If encoded outputs is not empty, queue a task on the control thread event loop to run the Output EncodedVideoChunks algorithm with encoded outputs.
-
flush()
-
Completes all control messages in the control message queue and emits all outputs.
When invoked, run these steps:
-
If
state
is not"configured"
, return a promise rejected withInvalidStateError
DOMException
. -
Let promise be a new Promise.
-
Queue a control message to flush the codec with promise.
-
Return promise.
Running a control message to flush the codec means performing these steps with promise.
-
Signal
[[codec implementation]]
to emit all internal pending outputs. -
Let encoded outputs be a list of encoded video data outputs emitted by
[[codec implementation]]
. -
If encoded outputs is not empty, queue a task on the control thread event loop to run the Output EncodedVideoChunks algorithm with encoded outputs.
-
Queue a task on the control thread event loop to resolve promise.
-
reset()
-
Immediately resets all state including configuration, control messages in the control message queue, and all pending
callbacks.
When invoked, run the Reset VideoEncoder algorithm.
close()
-
Immediately aborts all pending work and releases system resources.
Close is final.
When invoked, run the Close VideoEncoder algorithm.
isConfigSupported(config)
-
Returns a promise indicating whether the provided config is supported by
the user agent.
NOTE: The returned VideoEncoderSupport config will contain only the dictionary members that the user agent recognized. Unrecognized dictionary members will be ignored. Authors may detect unrecognized dictionary members by comparing config to their provided config.
When invoked, run these steps:
-
If config is not a valid VideoEncoderConfig, return a promise rejected with
TypeError
. -
Let p be a new Promise.
-
Let checkSupportQueue be the result of starting a new parallel queue.
-
Enqueue the following steps to checkSupportQueue:
-
Let encoderSupport be a newly constructed
VideoEncoderSupport
, initialized as follows:-
Set
config
to the result of running the Clone Configuration algorithm with config. -
Set
supported
to the result of running the Check Configuration Support algorithm with config.
-
-
Resolve p with encoderSupport.
-
-
Return p.
-
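A non-normative sketch follows, showing the two-argument output callback and a periodic key frame request via VideoEncoderEncodeOptions; the frame source and the send functions are illustrative assumptions.
// Non-normative sketch. `nextVideoFrame()`, `sendConfig()`, and `sendChunk()` are hypothetical author functions.
const encoder = new VideoEncoder({
  output: (chunk, output_config) => {
    if (output_config) sendConfig(output_config);  // non-null only when the decoder config changes
    sendChunk(chunk);
  },
  error: (e) => console.error(e),
});
encoder.configure({ codec: 'vp8', width: 640, height: 480, bitrate: 1000000 });
let frameCount = 0;
let frame;
while ((frame = nextVideoFrame()) !== null) {
  // Request a key frame every 30 frames; otherwise the user agent decides.
  encoder.encode(frame, { keyFrame: frameCount % 30 === 0 });
  frame.destroy();
  frameCount++;
}
await encoder.flush();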
6.5. Algorithms
- Output EncodedVideoChunks (with outputs)
-
Run these steps:
-
For each output in outputs:
-
Let encoder_config be the
[[active encoder config]]
.The intent is for encoder_config to be the
[[active encoder config]]
that was used to encode output. But, as written, it may occur that output was encoded using a previousVideoEncoderConfig
that has since been replaced by a later call toconfigure()
. See #138. -
Let output_config be a
VideoDecoderConfig
that describes output. Initialize output_config as follows:-
Assign
encoder_config.codec
tooutput_config.codec
. -
Assign
encoder_config.width
tooutput_config.cropWidth
. -
Assign
encoder_config.height
tooutput_config.cropHeight
. -
Assign
encoder_config.displayWidth
tooutput_config.displayWidth
. -
Assign
encoder_config.displayHeight
tooutput_config.displayHeight
. -
Assign the remaining keys of
output_config
as determined by[[codec implementation]]
. The user agent must ensure that the configuration is completely described such that output_config could be used to correctly decode output.NOTE: This includes supplying the
description
to describe codec specific "extradata", the use of which may be further described in codec registrations listed in the [WEBCODECS-CODEC-REGISTRY].
-
-
If output_config and [[active output config]] are equal dictionaries, set output_config to null. Otherwise, set [[active output config]] to output_config.
NOTE: The VideoDecoderConfig output_config will be null if the configuration hasn’t changed from previous outputs. The first output will always include a non-null output_config. -
Let chunkInit be an
EncodedVideoChunkInit
with the following keys:-
Let
data
contain the encoded video data from output. -
Let
type
be theEncodedVideoChunkType
of output. -
Let
timestamp
be the[[timestamp]]
from theVideoFrame
associated with output. -
Let
duration
be the[[duration]]
from theVideoFrame
associated with output.
-
-
Let chunk be a new
EncodedVideoChunk
constructed with chunkInit. -
Invoke
[[output callback]]
with chunk.
-
-
- Reset VideoEncoder
-
Run these steps:
-
If
state
is"closed"
, throw anInvalidStateError
. -
Set
state
to"unconfigured"
. -
Set
[[active encoder config]]
tonull
. -
Set
[[active output config]]
tonull
. -
Signal
[[codec implementation]]
to cease producing output for the previous configuration. -
Set
encodeQueueSize
to zero.
-
- Close VideoEncoder (with error)
-
Run these steps:
-
Run the Reset VideoEncoder algorithm.
-
Set
state
to"closed"
. -
Clear
[[codec implementation]]
and release associated system resources. -
If error is set, queue a task on the control thread event loop to invoke the
[[error callback]]
with error.
-
7. Configurations
7.1. Check Configuration Support (with config)
Run these steps:-
If the user agent can provide a codec to support all entries of the config, including applicable default values for keys that are not included, return
true
.NOTE: The types
AudioDecoderConfig
,VideoDecoderConfig
,AudioEncoderConfig
, andVideoEncoderConfig
each define their respective configuration entries and defaults.NOTE: Support for a given configuration may change dynamically if the hardware is altered (e.g. external GPU unplugged) or if required hardware resources are exhausted. User agents should describe support on a best-effort basis given the resources that are available at the time of the query.
-
Otherwise, return false.
7.2. Clone Configuration (with config)
NOTE: This algorithm will copy only the dictionary members that the user agent recognizes as part of the dictionary type.
Run these steps:
-
Let dictType be the type of dictionary config.
-
Let clone be a new empty instance of dictType.
-
For each dictionary member m defined on dictType:
-
If config[m] is a nested dictionary, set clone[m] to the result of recursively running the Clone Configuration algorithm with config[m]. -
Otherwise, assign the value of config[m] to clone[m].
7.3. Signalling Configuration Support
7.3.1. AudioDecoderSupport
dictionary AudioDecoderSupport {
  boolean supported;
  AudioDecoderConfig config;
};
supported
, of type boolean - A boolean indicating whether the corresponding
config
is supported by the user agent. config
, of type AudioDecoderConfig- An
AudioDecoderConfig
used by the user agent in determining the value ofsupported
.
7.3.2. VideoDecoderSupport
dictionary VideoDecoderSupport {
  boolean supported;
  VideoDecoderConfig config;
};
supported
, of type boolean - A boolean indicating whether the corresponding
config
is supported by the user agent. config
, of type VideoDecoderConfig- A
VideoDecoderConfig
used by the user agent in determining the value ofsupported
.
7.3.3. AudioEncoderSupport
dictionary AudioEncoderSupport {
  boolean supported;
  AudioEncoderConfig config;
};
supported
, of type boolean - A boolean indicating whether the corresponding
config
is supported by the user agent. config
, of type AudioEncoderConfig- An
AudioEncoderConfig
used by the user agent in determining the value ofsupported
.
7.3.4. VideoEncoderSupport
dictionary VideoEncoderSupport {
  boolean supported;
  VideoEncoderConfig config;
};
supported
, of type boolean - A boolean indicating whether the corresponding
config
is supported by the user agent. config
, of type VideoEncoderConfig- A
VideoEncoderConfig
used by the user agent in determining the value ofsupported
.
7.4. Codec String
A codec string describes a given codec format to be used for encoding or decoding.
A valid codec string must meet the following conditions.
-
Is valid per the relevant codec specification (see examples below).
-
It describes a single codec.
-
It is unambiguous about codec profile and level for codecs that define these concepts.
NOTE: In other media specifications, codec strings historically accompanied a MIME type as the "codecs=" parameter
(isTypeSupported()
, canPlayType()
) [RFC6381]. In this specification, encoded media is not containerized;
hence, only the value of the codecs parameter is accepted.
The format and semantics for codec strings are defined by codec registrations listed in the [WEBCODECS-CODEC-REGISTRY]. A compliant implementation may support any combination of codec registrations or none at all.
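Some illustrative codec strings follow (non-normative); whether any of them is supported is up to the implementation and the registrations it implements.
// Non-normative examples of codec strings from common registrations:
//   'vp8'          video; VP8 defines no profile/level concepts
//   'avc1.42001E'  video; H.264 Baseline profile, level 3.0
//   'opus'         audio
// A bare 'avc1' would be ambiguous about profile and level and is therefore not a valid codec string.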
7.5. AudioDecoderConfig
dictionary AudioDecoderConfig {
  required DOMString codec;
  required unsigned long sampleRate;
  required unsigned long numberOfChannels;
  BufferSource description;
};
To check if an AudioDecoderConfig
is a valid AudioDecoderConfig,
run these steps:
-
If codec is not a valid codec string, return
false
. -
Return
true
.
codec
, of type DOMString- Contains a codec string describing the codec.
sampleRate
, of type unsigned long- The number of frame samples per second.
numberOfChannels
, of type unsigned long- The number of audio channels.
description
, of type BufferSource-
A sequence of codec specific bytes, commonly known as extradata.
NOTE: The registrations in the [WEBCODECS-CODEC-REGISTRY] describe whether/how to populate this sequence, corresponding to the provided
codec
.
7.6. VideoDecoderConfig
dictionary VideoDecoderConfig {
  required DOMString codec;
  BufferSource description;
  unsigned long codedWidth;
  unsigned long codedHeight;
  unsigned long cropLeft;
  unsigned long cropTop;
  unsigned long cropWidth;
  unsigned long cropHeight;
  unsigned long displayWidth;
  unsigned long displayHeight;
  HardwareAcceleration hardwareAcceleration = "allow";
};
To check if a VideoDecoderConfig
is a valid VideoDecoderConfig,
run these steps:
-
If
codec
is not a valid codec string, returnfalse
. -
If
codedWidth
= 0 orcodedHeight
= 0, returnfalse
. -
If
cropWidth
= 0 orcropHeight
= 0, returnfalse
. -
If
cropTop
+cropHeight
>=codedHeight
, returnfalse
. -
If
cropLeft
+cropWidth
>=codedWidth
, returnfalse
. -
If
displayWidth
= 0 ordisplayHeight
= 0, returnfalse
. -
Return
true
.
codec
, of type DOMString- Contains a codec string describing the codec.
description
, of type BufferSource-
A sequence of codec specific bytes, commonly known as extradata.
NOTE: The registrations in the [WEBCODECS-CODEC-REGISTRY] may describe whether/how to populate this sequence, corresponding to the provided
codec
. codedWidth
, of type unsigned long- Width of the VideoFrame in pixels, prior to any cropping or aspect ratio adjustments.
codedHeight
, of type unsigned long- Height of the VideoFrame in pixels, prior to any cropping or aspect ratio adjustments.
cropLeft
, of type unsigned long- The number of pixels to remove from the left of the VideoFrame, prior to aspect ratio adjustments. Defaults to zero if not present.
cropTop
, of type unsigned long- The number of pixels to remove from the top of the VideoFrame, prior to aspect ratio adjustments. Defaults to zero if not present.
cropWidth
, of type unsigned long- The width in pixels to include in the crop, starting from cropLeft. Defaults to codedWidth if not present.
cropHeight
, of type unsigned long - The height in pixels to include in the crop, starting from cropTop. Defaults to codedHeight if not present.
displayWidth
, of type unsigned long- Width of the VideoFrame when displayed. Defaults to cropWidth if not present.
displayHeight
, of type unsigned long- Height of the VideoFrame when displayed. Defaults to cropHeight if not present.
hardwareAcceleration
, of type HardwareAcceleration, defaulting to"allow"
- Configures hardware acceleration for this codec. See
HardwareAcceleration
.
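For illustration (non-normative), a configuration for hypothetical anamorphic (non-square-pixel) content might look like the following; the codec string and h264Extradata are assumptions about the content being decoded.
// Non-normative sketch. `h264Extradata` is a hypothetical BufferSource obtained from the container.
const config = {
  codec: 'avc1.42001E',        // must be unambiguous about profile and level
  description: h264Extradata,
  codedWidth: 720,
  codedHeight: 480,
  displayWidth: 640,           // scaled at display time to the intended aspect ratio
  displayHeight: 480,
  hardwareAcceleration: 'allow',
};
const { supported } = await VideoDecoder.isConfigSupported(config);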
7.7. AudioEncoderConfig
dictionary AudioEncoderConfig {
  required DOMString codec;
  unsigned long sampleRate;
  unsigned long numberOfChannels;
  unsigned long long bitrate;
};
NOTE: Codec-specific extensions to AudioEncoderConfig
may be defined by the
registrations in the [WEBCODECS-CODEC-REGISTRY].
To check if an AudioEncoderConfig
is a valid AudioEncoderConfig,
run these steps:
-
If
codec
is not a valid codec string, returnfalse
. -
Return
true
.
codec
, of type DOMString- Contains a codec string describing the codec.
sampleRate
, of type unsigned long- The number of frame samples per second.
numberOfChannels
, of type unsigned long- The number of audio channels.
bitrate
, of type unsigned long long- The average bitrate of the encoded audio given in units of bits per second.
7.8. VideoEncoderConfig
dictionary VideoEncoderConfig {
  required DOMString codec;
  unsigned long long bitrate;
  required unsigned long width;
  required unsigned long height;
  unsigned long displayWidth;
  unsigned long displayHeight;
  HardwareAcceleration hardwareAcceleration = "allow";
};
NOTE: Codec-specific extensions to VideoEncoderConfig
may be defined by the
registrations in the [WEBCODECS-CODEC-REGISTRY].
To check if a VideoEncoderConfig
is a valid VideoEncoderConfig,
run these steps:
-
If
codec
is not a valid codec string, returnfalse
. -
If
displayWidth
= 0 ordisplayHeight
= 0, returnfalse
. -
Return
true
.
codec
, of type DOMString- Contains a codec string describing the codec.
bitrate
, of type unsigned long long- The average bitrate of the encoded video given in units of bits per second.
width
, of type unsigned long -
The encoded width of output EncodedVideoChunks in pixels, prior to any display aspect ratio adjustments.
The encoder must scale any VideoFrame whose [[crop width]] differs from this value.
height
, of type unsigned long -
The encoded height of output EncodedVideoChunks in pixels, prior to any display aspect ratio adjustments.
The encoder must scale any VideoFrame whose [[crop height]] differs from this value.
displayWidth
, of type unsigned long - The intended display width of output EncodedVideoChunks in pixels. Defaults to width if not present.
displayHeight
, of type unsigned long - The intended display height of output EncodedVideoChunks in pixels. Defaults to height if not present.
A displayWidth or displayHeight that differs from width and height signals that chunks should be scaled after decoding to arrive at the final display aspect ratio. For many codecs this is merely pass-through information, but some codecs may optionally include display sizing in the bitstream.
hardwareAcceleration
, of type HardwareAcceleration, defaulting to"allow"
- Configures hardware acceleration for this codec. See
HardwareAcceleration
.
7.9. Hardware Acceleration
enum HardwareAcceleration {
  "allow",
  "deny",
  "require"
};
When supported, hardware acceleration offloads encoding or decoding to specialized hardware.
Most authors will be best served by the default value, allow. This gives the user agent flexibility to optimize based on its knowledge of the system and configuration. A common strategy will be to prioritize hardware acceleration at higher resolutions, with a fallback to software codecs if hardware acceleration fails.
Authors should carefully weigh the tradeoffs when setting a hardware acceleration preference. The precise trade-offs will be device-specific, but authors should generally expect the following:
-
Setting a value of
require
may significantly restrict what configurations are supported. It may occur that the user’s device does not offer acceleration for any codec, or only for the most common profiles of older codecs. -
Hardware acceleration does not simply imply faster encoding / decoding. Hardware acceleration often has higher startup latency but more consistent throughput performance. Acceleration will generally reduce CPU load.
-
For decoding, hardware acceleration is often less robust to inputs that are mislabeled or violate the relevant codec specification.
-
Hardware acceleration will often be more power efficient than purely software based codecs.
-
For lower resolution content, the overhead added by hardware acceleration may yield decreased performance and power efficiency compared to purely software based codecs.
Given these tradeoffs, a good example of using "require" would be if an author intends to provide their own software based fallback via WebAssembly (see the sketch below).
Alternatively, a good example of using "deny" would be if an author is especially sensitive to the higher startup latency or decreased robustness generally associated with hardware acceleration.
allow
- Indicates that the user agent may use hardware acceleration if it is available and compatible with other aspects of the codec configuration.
deny
-
Indicates that the user agent must not use hardware acceleration.
NOTE: This will cause the configuration to be unsupported on platforms where an unaccelerated codec is unavailable or is incompatible with other aspects of the codec configuration.
require
-
Indicates that the user agent must use hardware acceleration.
NOTE: This will cause the configuration to be unsupported on platforms where an accelerated codec is unavailable or is incompatible with other aspects of the codec configuration.
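A non-normative sketch of the WebAssembly-fallback strategy mentioned above follows; createWasmDecoder, handleFrame, and handleError are hypothetical author-provided pieces.
// Non-normative sketch. Falls back to a hypothetical author-supplied WebAssembly decoder
// when an accelerated codec is unavailable for the configuration.
const config = { codec: 'vp8', codedWidth: 1280, codedHeight: 720, hardwareAcceleration: 'require' };
const { supported } = await VideoDecoder.isConfigSupported(config);
let decoder;
if (supported) {
  decoder = new VideoDecoder({ output: handleFrame, error: handleError });
  decoder.configure(config);
} else {
  decoder = createWasmDecoder(config);  // author-provided software path
}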
7.10. Configuration Equivalence
Two dictionaries are equal dictionaries if they contain the same keys and values. For nested dictionaries, apply this definition recursively.
7.11. VideoEncoderEncodeOptions
dictionary VideoEncoderEncodeOptions {
  boolean keyFrame = false;
};
keyFrame
, of type boolean, defaulting tofalse
- A value of
true
indicates that the given frame MUST be encoded as a key frame. A value offalse
indicates that the user agent has flexibility to decide whether the frame will be encoded as a key frame.
7.12. CodecState
enum CodecState {
  "unconfigured",
  "configured",
  "closed"
};
unconfigured
- The codec is not configured for encoding or decoding.
configured
- A valid configuration has been provided. The codec is ready for encoding or decoding.
closed
- The codec is no longer usable and underlying system resources have been released.
7.13. WebCodecsErrorCallback
callback WebCodecsErrorCallback = undefined(DOMException error);
8. Encoded Media Interfaces (Chunks)
These interfaces represent chunks of encoded media.
8.1. EncodedAudioChunk Interface
[Exposed=(Window,DedicatedWorker)]
interface EncodedAudioChunk {
  constructor(EncodedAudioChunkInit init);
  readonly attribute EncodedAudioChunkType type;
  readonly attribute unsigned long long timestamp;  // microseconds
  readonly attribute ArrayBuffer data;
};

dictionary EncodedAudioChunkInit {
  required EncodedAudioChunkType type;
  required unsigned long long timestamp;
  required BufferSource data;
};

enum EncodedAudioChunkType {
  "key",
  "delta"
};
8.1.1. Constructors
EncodedAudioChunk(init)
-
Let chunk be a new
EncodedAudioChunk
object, initialized as follows-
Assign
init.type
tochunk.type
. -
Assign
init.timestamp
tochunk.timestamp
. -
Assign a copy of
init.data
tochunk.data
.
-
-
Return chunk.
8.1.2. Attributes
type
, of type EncodedAudioChunkType, readonly- Describes whether the chunk is a key frame.
timestamp
, of type unsigned long long, readonly- The presentation timestamp, given in microseconds.
data
, of type ArrayBuffer, readonly- A sequence of bytes containing encoded audio data.
8.2. EncodedVideoChunk Interface
[Exposed=(Window,DedicatedWorker)]
interface EncodedVideoChunk {
  constructor(EncodedVideoChunkInit init);
  readonly attribute EncodedVideoChunkType type;
  readonly attribute unsigned long long timestamp;   // microseconds
  readonly attribute unsigned long long? duration;   // microseconds
  readonly attribute ArrayBuffer data;
};

dictionary EncodedVideoChunkInit {
  required EncodedVideoChunkType type;
  required unsigned long long timestamp;
  unsigned long long duration;
  required BufferSource data;
};

enum EncodedVideoChunkType {
  "key",
  "delta"
};
8.2.1. Constructors
EncodedVideoChunk(init)
-
Let chunk be a new
EncodedVideoChunk
object, initialized as follows-
Assign
init.type
tochunk.type
. -
Assign
init.timestamp
tochunk.timestamp
. -
If duration is present in init, assign
init.duration
tochunk.duration
. Otherwise, assign null tochunk.duration
.
-
-
Assign a copy of
init.data
tochunk.data
. -
Return chunk.
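For example (non-normative), a chunk may be constructed from demuxed data; encodedData, pts, and dur are illustrative author-supplied values.
// Non-normative sketch. `encodedData` (a BufferSource), `pts`, and `dur` (microseconds)
// are hypothetical values obtained from a demuxer.
const chunk = new EncodedVideoChunk({
  type: 'key',          // this sample does not depend on any other frames
  timestamp: pts,
  duration: dur,
  data: encodedData,
});
decoder.decode(chunk);  // `decoder` is a configured VideoDecoder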
8.2.2. Attributes
type
, of type EncodedVideoChunkType, readonly- Describes whether the chunk is a key frame or not.
timestamp
, of type unsigned long long, readonly- The presentation timestamp, given in microseconds.
duration
, of type unsigned long long, readonly, nullable- The presentation duration, given in microseconds.
data
, of type ArrayBuffer, readonly- A sequence of bytes containing encoded video data.
9. Raw Media Interfaces (Frames)
These interfaces represent unencoded (raw) media.
9.1. AudioFrame Interface
[Exposed=(Window,DedicatedWorker)]
interface AudioFrame {
  constructor(AudioFrameInit init);
  readonly attribute unsigned long long timestamp;
  readonly attribute AudioBuffer? buffer;
  undefined close();
};

dictionary AudioFrameInit {
  required unsigned long long timestamp;
  required AudioBuffer buffer;
};
9.1.1. Internal Slots
[[detached]]
- Boolean indicating whether close() was invoked and underlying resources have been released.
9.1.2. Constructors
AudioFrame(init)
-
Let frame be a new
AudioFrame
object. -
Assign
init.timestamp
toframe.timestamp
. -
Assign
init.buffer
toframe.buffer
. -
Assign
false
to the[[detached]]
internal slot. -
Return frame.
9.1.3. Attributes
timestamp
, of type unsigned long long, readonly- The presentation timestamp, given in microseconds.
buffer
, of type AudioBuffer, readonly, nullable- The buffer containing decoded audio data.
9.1.4. Methods
close()
-
Immediately frees system resources. When invoked, run these steps:
-
Release system resources for buffer and set its value to null.
-
Assign
true
to the[[detached]]
internal slot.
NOTE: This section needs work. We should use the name and semantics of VideoFrame destroy(). Similarly, we should add clone() to make a deep copy.
-
9.2. VideoFrame Interface
[Exposed=(Window,DedicatedWorker)]
interface VideoFrame {
  constructor(ImageBitmap imageBitmap, optional VideoFrameInit frameInit = {});
  constructor(PixelFormat pixelFormat, sequence<(Plane or PlaneInit)> planes,
              optional VideoFrameInit frameInit = {});

  readonly attribute PixelFormat format;
  readonly attribute FrozenArray<Plane> planes;
  readonly attribute unsigned long codedWidth;
  readonly attribute unsigned long codedHeight;
  readonly attribute unsigned long cropLeft;
  readonly attribute unsigned long cropTop;
  readonly attribute unsigned long cropWidth;
  readonly attribute unsigned long cropHeight;
  readonly attribute unsigned long displayWidth;
  readonly attribute unsigned long displayHeight;
  readonly attribute unsigned long long? duration;
  readonly attribute unsigned long long? timestamp;

  undefined destroy();
  VideoFrame clone();

  Promise<ImageBitmap> createImageBitmap(optional ImageBitmapOptions options = {});
};

dictionary VideoFrameInit {
  unsigned long codedWidth;
  unsigned long codedHeight;
  unsigned long cropLeft;
  unsigned long cropTop;
  unsigned long cropWidth;
  unsigned long cropHeight;
  unsigned long displayWidth;
  unsigned long displayHeight;
  unsigned long long duration;
  unsigned long long timestamp;
};
9.2.1. Internal Slots
[[detached]]
-
A boolean indicating whether
destroy()
was invoked and underlying resources have been released. [[format]]
-
A
PixelFormat
describing the pixel format of theVideoFrame
. [[planes]]
-
A list of
Plane
s describing the memory layout of the pixel data inVideoFrame
. The number ofPlane
s and their semantics are determined by[[format]]
. [[coded width]]
-
Width of the
VideoFrame
in pixels, prior to any cropping or aspect ratio adjustments. [[coded height]]
-
Height of the
VideoFrame
in pixels, prior to any cropping or aspect ratio adjustments. [[crop left]]
-
The number of pixels to remove from the left of the
VideoFrame
, prior to aspect ratio adjustments. [[crop top]]
-
The number of pixels to remove from the top of the
VideoFrame
, prior to aspect ratio adjustments. [[crop width]]
-
The width of pixels to include in the crop, starting from cropLeft.
[[crop height]]
-
The height of pixels to include in the crop, starting from cropTop.
[[display width]]
-
Width of the
VideoFrame
when displayed after applying aspect ratio adjustments. [[display height]]
-
Height of the
VideoFrame
when displayed after applying aspect ratio adjustments. [[duration]]
-
The presentation duration, given in microseconds. The duration is copied from the
EncodedVideoChunk
corresponding to thisVideoFrame
. [[timestamp]]
-
The presentation timestamp, given in microseconds. The timestamp is copied from the
EncodedVideoChunk
corresponding to thisVideoFrame
.
9.2.2. Constructors
NOTE: this section needs work. Current wording assumes a VideoFrame can always
be easily represented using one of the known pixel formats. In practice, the
underlying UA resources may be GPU backed or formatted in such a way that
conversion to an allowed pixel format requires expensive copies and
translation. When this occurs, we should allow planes to be null and format
to be "opaque" to avoid early optimization. We should make conversion
explicit and user controlled by offering a videoFrame.convertTo(format)
that returns a Promise containing a new VideoFrame for which the
copies/translations are performed.
VideoFrame(imageBitmap, frameInit)
-
If frameInit is not a valid VideoFrameInit, throw a
TypeError
. -
If the value of imageBitmap’s [[Detached]] internal slot is set to true, then throw an InvalidStateError DOMException. -
Let frame be a new
VideoFrame
. -
Assign
false
to frame’s[[detached]]
internal slot. -
Use a copy of the pixel data in imageBitmap to initialize the following frame internal slots:
-
Initialize
[[format]]
to the underlying format of imageBitmap.
Initialize
[[planes]]
to describe the arrangement of memory of the copied pixel data. -
Assign regions of the copied pixel data to the
[[plane buffer]]
internal slot of each plane as appropriate for the pixel format. -
Initialize
[[coded width]]
and[[coded height]]
to describe the width and height of the imageBitmap prior to any cropping or aspect ratio adjustments.
-
-
Use frameInit to initialize the remaining frame internal slots:
-
If
frameInit.cropLeft
is present, assign it to[[crop left]]
. Otherwise, assign0
to[[crop left]]
. -
If
frameInit.cropTop
is present, assign it to[[crop top]]
. Otherwise, assign0
to[[crop top]]
-
If
frameInit.cropWidth
is present, assign it to[[crop width]]
. Otherwise, assign[[coded width]]
to[[crop width]]
. -
If
frameInit.cropHeight
is present, assign it to[[crop height]]
. Otherwise, assign[[coded height]]
to[[crop height]]
. -
If
frameInit.displayWidth
is present, assign it to[[display width]]
. Otherwise, assign[[crop width]]
to[[display width]]
. -
If
frameInit.displayHeight
is present, assign it to[[display height]]
. Otherwise, assign[[crop height]]
to[[display height]]
. -
If
frameInit.duration
is present, assign it to[[duration]]
. Otherwise, assignnull
to[[duration]]
. -
If
frameInit.timestamp
is present, assign it to[[timestamp]]
. Otherwise, assignnull
to[[timestamp]]
.
-
-
Return frame.
VideoFrame(pixelFormat, planes, frameInit)
-
If either
codedWidth
orcodedHeight
is not present in frameInit, throw aTypeError
. -
If frameInit is not a valid VideoFrameInit, throw a
TypeError
. -
If the length of planes is incompatible with the given pixelFormat, throw a
TypeError
. -
Let frame be a new
VideoFrame
object. -
Assign
false
to frame’s[[detached]]
internal slot. -
Assign
pixelFormat
to frame’s[[format]]
. -
For each element p in planes:
-
If p is a
Plane
, append a copy of p to frame’s[[planes]]
and continue. -
If p is a
PlaneInit
, append a newPlane
q to frame’s[[planes]]
, initialized as follows:-
Assign a copy of
p.src
to q’s[[plane buffer]]
internal slot.NOTE: the samples should be copied exactly, but the user agent may add row padding as needed to improve memory alignment.
-
Assign the width of each row in [[plane buffer]], including any padding, to
q.stride
. -
Assign
p.rows
toq.rows
. -
Assign the product of (q.rows * q.stride) to q.length.
-
-
-
Assign
frameInit.codedWidth
to frame’s[[coded width]]
. -
Assign
frameInit.codedHeight
to frame’s[[coded height]]
. -
If
frameInit.cropLeft
is present, assign it to frame’s [[crop left]]
. Otherwise, assign0
to[[crop left]]
. -
If
frameInit.cropTop
is present, assign it to frame’s[[crop top]]
. Otherwise, assign0
to[[crop top]]
. -
If
frameInit.cropWidth
is present, assign it to frame’s[[crop width]]
. Otherwise, assign[[coded width]]
to[[crop width]]
. -
If
frameInit.cropHeight
is present, assign it to frame’s[[crop height]]
. Otherwise, assign[[coded height]]
to[[crop height]]
. -
If
frameInit.displayWidth
is present, assign it to frame’s[[display width]]
. Otherwise, assign[[crop width]]
to[[display width]]
. -
If
frameInit.displayHeight
is present, assign it to frame’s[[display height]]
. Otherwise, assign[[crop height]]
to[[display height]]
. -
If
frameInit.duration
is present, assign it to[[duration]]
. Otherwise, assignnull
to[[duration]]
. -
If
frameInit.timestamp
is present, assign it to[[timestamp]]
. Otherwise, assignnull
to[[timestamp]]
. -
Return frame.
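For example (non-normative), the ImageBitmap constructor can wrap pixels captured elsewhere on the page; bitmap and encoder are illustrative author-supplied objects.
// Non-normative sketch. `bitmap` is an author-supplied ImageBitmap (e.g. from createImageBitmap(canvas)).
const frame = new VideoFrame(bitmap, { timestamp: 0 });  // timestamp in microseconds
encoder.encode(frame);   // `encoder` is a configured VideoEncoder
frame.destroy();         // release pixel resources promptly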
9.2.3. Attributes
format
, of type PixelFormat, readonly-
Describes the arrangement of bytes in each plane as well as the number and order of the planes.
The
format
getter steps are to return[[format]]
. planes
, of type FrozenArray<Plane>, readonly-
Holds the pixel data, laid out as described by the format and Plane attributes.
The
planes
getter steps are to return[[planes]]
. codedWidth
, of type unsigned long, readonly-
Width of the
VideoFrame
in pixels, prior to any cropping or aspect ratio adjustments.The
codedWidth
getter steps are to return[[coded width]]
. codedHeight
, of type unsigned long, readonly-
Height of the VideoFrame in pixels, prior to any cropping or aspect ratio adjustments.
The
codedHeight
getter steps are to return[[coded height]]
. cropLeft
, of type unsigned long, readonly-
The number of pixels to remove from the left of the VideoFrame, prior to aspect ratio adjustments.
The
cropLeft
getter steps are to return[[crop left]]
. cropTop
, of type unsigned long, readonly-
The number of pixels to remove from the top of the VideoFrame, prior to aspect ratio adjustments.
The
cropTop
getter steps are to return[[crop top]]
. cropWidth
, of type unsigned long, readonly-
The width of pixels to include in the crop, starting from cropLeft.
The
cropWidth
getter steps are to return[[crop width]]
. cropHeight
, of type unsigned long, readonly-
The height, in pixels, of the region to include in the crop, starting from cropTop.
The
cropHeight
getter steps are to return[[crop height]]
. displayWidth
, of type unsigned long, readonly-
Width of the VideoFrame when displayed after applying aspect ratio adjustments.
The
displayWidth
getter steps are to return[[display width]]
. displayHeight
, of type unsigned long, readonly-
Height of the VideoFrame when displayed after applying aspect ratio adjustments.
The
displayHeight
getter steps are to return[[display height]]
. timestamp
, of type unsigned long long, readonly, nullable-
The presentation timestamp, given in microseconds. The timestamp is copied from the
EncodedVideoChunk
corresponding to this VideoFrame.

The
timestamp
getter steps are to return[[timestamp]]
. duration
, of type unsigned long long, readonly, nullable-
The presentation duration, given in microseconds. The duration is copied from the
EncodedVideoChunk
corresponding to this VideoFrame.

The
duration
getter steps are to return[[duration]]
.
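As a non-normative illustration of how these attributes relate, the following sketch logs the geometry of a frame; the values printed depend entirely on the frame and are not specified here.

    // Non-normative sketch: codedWidth/codedHeight give the full decoded size,
    // the crop* attributes select the visible region, and displayWidth/
    // displayHeight give the size after aspect ratio adjustment.
    function describeGeometry(frame) {
      console.log(`coded:   ${frame.codedWidth}x${frame.codedHeight}`);
      console.log(`crop:    ${frame.cropWidth}x${frame.cropHeight}` +
                  ` at (${frame.cropLeft}, ${frame.cropTop})`);
      console.log(`display: ${frame.displayWidth}x${frame.displayHeight}`);
      console.log(`pts/dur: ${frame.timestamp} / ${frame.duration} microseconds`);
    }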
9.2.4. Methods
destroy()
Immediately frees system resources. Destruction applies to all
references, including references that are serialized and passed across
Realms.
NOTE: Authors should take care to manage frame lifetimes by calling destroy()
immediately when frames are no longer needed.
NOTE: Use clone() to create a deep copy. Cloned frames have their own lifetime and will not be affected by destroying the original frame.
When invoked, run these steps:
-
If
[[detached]]
istrue
, throw anInvalidStateError
. -
Remove all
Plane
s from[[planes]]
and release associated memory. -
Assign
true
to the[[detached]]
internal slot.
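A non-normative usage sketch follows; paintToCanvas() is a hypothetical application function, not part of this specification.

    // Non-normative sketch: release codec system resources as soon as the
    // frame has been consumed.
    function handleDecodedFrame(frame) {
      try {
        paintToCanvas(frame);  // hypothetical application rendering step
      } finally {
        frame.destroy();       // frees the frame's resources immediately
      }
    }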
clone()
Creates a new VideoFrame
with a separate lifetime containing a deep copy of
this frame’s resources.
NOTE: VideoFrames may require a large amount of memory. Use clone()
sparingly.
When invoked, run the following steps:
-
If the value of the
[[detached]]
slot istrue
, return a promise rejected with an InvalidStateError
DOMException
. -
Let p be a new Promise.
-
In parallel, resolve p with the result of running the Clone Frame algorithm with this.
-
Return p.
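A non-normative sketch of retaining a deep copy beyond an output callback; the retained array is an illustrative application detail.

    // Non-normative sketch: keep a clone alive while destroying the original.
    const retained = [];  // frames the application wants to keep (illustrative)

    async function onOutput(frame) {
      retained.push(await frame.clone());  // deep copy with its own lifetime
      frame.destroy();                     // original resources freed immediately
    }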
createImageBitmap(options)
Creates an ImageBitmap from this VideoFrame
.
When invoked, run these steps:
-
Let p be a new Promise.
-
If either options’s
resizeWidth
orresizeHeight
is present and is 0, then return p rejected with anInvalidStateError
DOMException
. -
If this’s
[[detached]]
internal slot is set totrue
, then return p rejected with anInvalidStateError
DOMException
. -
Let imageBitmap be a new
ImageBitmap
object. -
Set imageBitmap’s bitmap data to a copy of the
VideoFrame
pixel data, at the frame’s intrinsic width and intrinsic height (i.e
., after any aspect-ratio correction has been applied), cropped to the source rectangle with formatting. -
If the origin of imageBitmap’s image is not same origin with entry settings object’s origin, then set the origin-clean flag of imageBitmap’s bitmap to
false
. -
Run this step in parallel:
-
Resolve p with imageBitmap.
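A non-normative rendering sketch; the canvas element is assumed to be supplied by the application.

    // Non-normative sketch: render a frame through createImageBitmap().
    async function draw(frame, canvas) {
      const bitmap = await frame.createImageBitmap({});
      const ctx = canvas.getContext("2d");
      if (!ctx) return;
      ctx.drawImage(bitmap, 0, 0, canvas.width, canvas.height);
      bitmap.close();   // release the bitmap's resources
      frame.destroy();  // release the frame's resources
    }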
9.2.5. Algorithms
To check if aVideoFrameInit
is a valid VideoFrameInit,
run these steps:
-
If
codedWidth
= 0 orcodedHeight
= 0, returnfalse
. -
If
cropWidth
= 0 orcropHeight
= 0, returnfalse
. -
If
cropTop
+cropHeight
> codedHeight
, returnfalse
. -
If
cropLeft
+cropWidth
> codedWidth
, returnfalse
. -
If
displayWidth
= 0 ordisplayHeight
= 0, returnfalse
. -
Return
true
.
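The checks above can be summarized by the following non-normative sketch. The defaulting of absent members mirrors the constructor steps earlier in this section and is an informative restatement; the normative steps are authoritative.

    // Non-normative sketch of the "valid VideoFrameInit" checks.
    function isValidVideoFrameInit(init) {
      const cropLeft   = init.cropLeft   ?? 0;
      const cropTop    = init.cropTop    ?? 0;
      const cropWidth  = init.cropWidth  ?? init.codedWidth;
      const cropHeight = init.cropHeight ?? init.codedHeight;
      const displayWidth  = init.displayWidth  ?? cropWidth;
      const displayHeight = init.displayHeight ?? cropHeight;

      if (init.codedWidth === 0 || init.codedHeight === 0) return false;
      if (cropWidth === 0 || cropHeight === 0) return false;
      if (cropTop + cropHeight > init.codedHeight) return false;
      if (cropLeft + cropWidth > init.codedWidth) return false;
      if (displayWidth === 0 || displayHeight === 0) return false;
      return true;
    }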
9.3. Plane Interface
APlane
acts like a thin wrapper around an ArrayBuffer
, but may actually
be backed by a texture. Plane
s hide any padding before the first sample
or after the last row.
A Plane
is solely constructed by its VideoFrame
. During construction,
the User Agent may use knowledge of the frame’s PixelFormat
to add
padding to the Plane
to improve memory alignment.
A Plane
cannot be used after the VideoFrame
is destroyed. A new VideoFrame
can be assembled from existing Plane
s, and the new VideoFrame
will remain valid when the original is destroyed. This makes
it possible to efficiently add an alpha plane to an existing VideoFrame
.
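As a non-normative illustration, the sketch below assembles a new frame from an existing frame’s Planes plus an additional alpha plane. The "I420A" pixel format and the alphaData buffer are hypothetical assumptions; this draft currently defines only "I420" (see § 9.4).

    // Non-normative sketch: reuse existing Planes and append an alpha plane.
    // "I420A" is a hypothetical format and alphaData a hypothetical buffer.
    function addAlpha(frame, alphaData) {
      const alphaPlane = {
        src: alphaData,                // BufferSource with one byte per pixel
        stride: frame.codedWidth,
        rows: frame.codedHeight,
      };
      const withAlpha = new VideoFrame("I420A", [...frame.planes, alphaPlane], {
        codedWidth: frame.codedWidth,
        codedHeight: frame.codedHeight,
        timestamp: frame.timestamp,    // carried over from the original
      });
      frame.destroy();  // the new frame remains valid after this
      return withAlpha;
    }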
[Exposed=(Window, DedicatedWorker)]
interface Plane {
  readonly attribute unsigned long stride;
  readonly attribute unsigned long rows;
  readonly attribute unsigned long length;

  undefined readInto(ArrayBufferView dst);
};

dictionary PlaneInit {
  required BufferSource src;
  required unsigned long stride;
  required unsigned long rows;
};
9.3.1. Internal Slots
[[parent frame]]
- Refers to the
VideoFrame
that constructed and owns this plane. [[plane buffer]]
- Internal storage for the plane’s pixel data.
9.3.2. Attributes
stride
, of type unsigned long, readonly- The width of each row, in bytes, including any padding.
rows
, of type unsigned long, readonly- The number of rows.
length
, of type unsigned long, readonly- The total byte length of the plane (stride * rows).
9.3.3. Methods
readInto(dst)
Copies the plane data into dst.
When invoked, run these steps:
-
If
[[parent frame]]
has been destroyed, throw anInvalidStateError
. -
If
length
is greater than dst.byteLength, throw a TypeError
. -
Copy the
[[plane buffer]]
into dst.
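A non-normative sketch of copying plane data out of a frame:

    // Non-normative sketch: copy each plane's bytes into a new Uint8Array.
    function copyPlanes(frame) {
      return frame.planes.map((plane) => {
        const dst = new Uint8Array(plane.length);  // length = stride * rows
        plane.readInto(dst);
        return dst;
      });
    }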
9.4. Pixel Format
Pixel formats describe the arrangement of bytes in each plane as well as the number and order of the planes.

NOTE: This section needs work. We expect to add more pixel formats and offer more detailed definitions. For now, please see http://www.fourcc.org/pixel-format/yuv-i420/ for a more complete description.
enum PixelFormat {
  "I420"
};
I420
- Planar 4:2:0 YUV.
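For orientation, the following non-normative sketch computes the plane dimensions conventionally associated with I420 (a full-resolution Y plane followed by half-resolution U and V planes). Even coded dimensions are assumed, and the strides shown are minimums; a user agent may add row padding.

    // Non-normative sketch: conventional I420 plane dimensions for a frame
    // whose coded size is width x height (both assumed even).
    function i420PlaneSizes(width, height) {
      return [
        { name: "Y", stride: width,     rows: height },      // luma
        { name: "U", stride: width / 2, rows: height / 2 },  // chroma
        { name: "V", stride: width / 2, rows: height / 2 },  // chroma
      ];
    }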
9.5. Algorithms
- Clone Frame (with frame)
-
-
Let cloneFrame be a new object of the same type as frame (either
AudioFrame
orVideoFrame
). -
Initialize each attribute and internal slot of cloneFrame with a copy of the corresponding value from frame.
NOTE: User Agents are encouraged to avoid expensive copies of large objects (for instance,
VideoFrame
pixel data). Frame types are immutable, so the above step may be implemented using memory sharing techniques such as reference counting. -
Return cloneFrame.
-
10. Security Considerations
The primary security impact is that features of this API make it easier for an attacker to exploit vulnerabilities in the underlying platform codecs. Additionally, new abilities to configure and control the codecs may allow for new exploits that rely on a specific configuration and/or sequence of control operations.
Platform codecs are historically an internal detail of APIs like HTMLMediaElement
, [WebAudio], and [WebRTC]. In this way, it has always
been possible to attack the underlying codecs by using malformed media
files/streams and invoking the various API control methods.
For example, you can send any stream to a decoder by first wrapping that stream
in a media container (e.g. mp4) and setting that as the src
of an HTMLMediaElement
. You can then cause the underlying video decoder to
be reset()
by setting a new value for <video>.currentTime
.
WebCodecs makes such attacks easier by exposing low level control over when inputs are provided and direct access to invoke the codec control methods. This also affords attackers the ability to invoke sequences of control methods that were not previously possible via the higher level APIs.
User agents should mitigate this risk by extensively fuzzing their implementation with random inputs and control method invocations. Additionally, user agents are encouraged to isolate their underlying codecs in processes with restricted privileges (sandbox) as a barrier against successful exploits being able to read user data.
An additional concern is exposing the underlying codecs to input mutation race conditions. Specifically, it should not be possible for a site to mutate a codec input or output while the underlying codec may still be operating on that data. This concern is mitigated by ensuring that input and output interfaces are immutable.
EncodedVideoChunk and EncodedAudioChunk currently expose mutable data. See #80.
11. Privacy Considerations
The primary privacy impact is an increased ability to fingerprint users by querying for different codec capabilities to establish a codec feature profile. Much of this profile is already exposed by existing APIs. Such profiles are very unlikely to be uniquely identifying, but may be used with other metrics to create a fingerprint.

An attacker may accumulate a codec feature profile by calling IsConfigSupported()
methods with a number of different configuration
dictionaries. Similarly, an attacker may attempt to configure()
a codec with
different configuration dictionaries and observe which configurations are
accepted.
Attackers may also use existing APIs to establish much of the codec feature
profile. For example, the [media-capabilities] decodingInfo()
API
describes what types of decoders are supported and its powerEfficient
attribute may signal when a decoder uses hardware acceleration. Similarly, the [WebRTC] getCapabilities()
API may be used to determine what
types of encoders are supported and the getStats()
API may
be used to determine when an encoder uses hardware acceleration. WebCodecs will
expose some additional information in the form of low level codec features.
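As a non-normative illustration of profiling via an existing API, the sketch below queries [media-capabilities]; the codec string and configuration values are illustrative.

    // Non-normative sketch: probing decoder support and power efficiency with
    // the Media Capabilities API. The configuration values are illustrative.
    async function probeDecode() {
      const info = await navigator.mediaCapabilities.decodingInfo({
        type: "file",
        video: {
          contentType: 'video/mp4; codecs="avc1.64001F"',
          width: 1920,
          height: 1080,
          bitrate: 2000000,
          framerate: 30,
        },
      });
      return { supported: info.supported, powerEfficient: info.powerEfficient };
    }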
A codec feature profile alone is unlikely to be uniquely identifying. Underlying codecs are often implemented entirely in software (be it part of the user agent binary or part of the operating system), such that all users who run that software will have a common set of capabilities. Additionally, underlying codecs are often implemented with hardware acceleration, but such hardware is mass-produced and devices of a particular class and manufacture date (e.g. flagship phones manufactured in 2020) will often have common capabilities. There will be outliers (some users may run outdated versions of software codecs or use a rare mix of custom-assembled hardware), but most of the time a given codec feature profile is shared by a large group of users.
Segmenting users by codec feature profile still contributes entropy that can be combined with other metrics to uniquely identify a user. User agents may partially mitigate this by returning an error whenever a site attempts to exhaustively probe for codec capabilities. Additionally, user agents may implement a "privacy budget" that depletes as authors use WebCodecs and other identifying APIs. Upon exhaustion of the privacy budget, codec capabilities could be reduced to a common baseline, or the user could be prompted for approval.