trackIdentifier of type DOMString
The value of the MediaStreamTrack's id attribute.
kind of type DOMString
The value of the MediaStreamTrack's kind attribute.
This is either "audio" or "video".
mid of type DOMString
If the RTCRtpTransceiver owning this stream has a
mid value that is not null, this is that
value; otherwise this member is not present.
remoteId of type DOMString
remoteId is used for looking up the remote
RTCRemoteOutboundRtpStreamStats object for the same SSRC.
framesDecoded of type unsigned long
Only exists for video. It represents the total number of frames correctly decoded
for this RTP stream, i.e., frames that would be displayed if no frames are dropped.
keyFramesDecoded of type unsigned long
Only exists for video. It represents the total number of key frames, such as key
frames in VP8 [RFC6386] or IDR-frames in H.264 [RFC6184], successfully
decoded for this RTP media stream. This is a subset of framesDecoded.
framesDecoded - keyFramesDecoded gives
you the number of delta frames decoded.
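The subtraction above can be sketched as follows. This is a minimal illustration, assuming the two counters have already been read from a stats report; the `DecodedFrameCounters` interface and both function names are hypothetical and not part of any API.

```typescript
// Hypothetical subset of an inbound video stats entry; only the two counters used here.
interface DecodedFrameCounters {
  framesDecoded: number;
  keyFramesDecoded: number;
}

// Delta frames are all decoded frames that are not key frames.
function deltaFramesDecoded(stats: DecodedFrameCounters): number {
  return stats.framesDecoded - stats.keyFramesDecoded;
}

// Fraction of decoded frames that were key frames; undefined before the first decoded frame.
function keyFrameRatio(stats: DecodedFrameCounters): number | undefined {
  if (stats.framesDecoded === 0) return undefined;
  return stats.keyFramesDecoded / stats.framesDecoded;
}
```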
frameWidth of type unsigned long
Only exists for video. Represents the width of the last decoded frame. Before the
first frame is decoded this member does not exist.
frameHeight of type unsigned long
Only exists for video. Represents the height of the last decoded frame. Before
the first frame is decoded this member does not exist.
framesPerSecond of type double
Only exists for video. The number of decoded frames in the last second.
qpSum of type unsigned long
Only exists for video. The sum of the QP values of frames decoded by this
receiver. The count of frames is in framesDecoded.
The definition of QP value depends on the codec; for VP8, the QP value is the
value carried in the frame header as the syntax element
y_ac_qi, and defined in
[RFC6386] section 19.2. Its range is 0..127.
Note that the QP value is only an indication of quantizer values used; many
formats have ways to vary the quantizer value within the frame.
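Because qpSum and framesDecoded are both cumulative, an average QP over an interval can be derived from two snapshots. The sketch below assumes the caller keeps the previous snapshot; `QpSnapshot` and `averageQp` are hypothetical names used only for illustration.

```typescript
// Hypothetical pair of cumulative counters copied from one stats report.
interface QpSnapshot {
  qpSum: number;
  framesDecoded: number;
}

// Average QP of the frames decoded between two snapshots.
// Returns undefined when no new frame was decoded in the interval.
function averageQp(prev: QpSnapshot, curr: QpSnapshot): number | undefined {
  const frames = curr.framesDecoded - prev.framesDecoded;
  if (frames <= 0) return undefined;
  return (curr.qpSum - prev.qpSum) / frames;
}
```

The result is only comparable between streams that use the same codec, since the QP scale is codec-specific.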
totalDecodeTime of type double
Total number of seconds that have been spent decoding the
frames of this stream. The average decode time can be calculated by dividing this
value by framesDecoded. The time it takes to decode one frame is the
time passed between feeding the decoder a frame and the decoder returning decoded
data for that frame.
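A minimal sketch of the average decode time over an interval, assuming two snapshots of the cumulative counters are available. The `DecodeTimeSnapshot` interface and the `averageDecodeTimeMs` helper are hypothetical.

```typescript
// Hypothetical pair of cumulative counters copied from one stats report.
interface DecodeTimeSnapshot {
  totalDecodeTime: number; // seconds
  framesDecoded: number;
}

// Average time spent decoding one frame, in milliseconds, between two snapshots.
// Returns undefined when no new frame was decoded in the interval.
function averageDecodeTimeMs(prev: DecodeTimeSnapshot, curr: DecodeTimeSnapshot): number | undefined {
  const frames = curr.framesDecoded - prev.framesDecoded;
  if (frames <= 0) return undefined;
  const seconds = curr.totalDecodeTime - prev.totalDecodeTime;
  return (seconds / frames) * 1000;
}
```

Passing a zeroed snapshot as `prev` yields the average over the lifetime of the stream.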
totalInterFrameDelay of type double
Sum of the interframe delays in seconds between consecutively decoded frames,
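recorded just after a frame has been decoded. The interframe delay variance can be
calculated from totalInterFrameDelay, totalSquaredInterFrameDelay, and
framesDecoded according to the formula:
(totalSquaredInterFrameDelay - totalInterFrameDelay^2 / framesDecoded) / framesDecoded.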
totalSquaredInterFrameDelay of type double
Sum of the squared interframe delays in seconds between consecutively decoded frames,
recorded just after a frame has been decoded. See totalInterFrameDelay for
details on how to calculate the interframe delay variance.
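The variance can be recovered from the two running sums with the usual sum-of-squares identity. The sketch below assumes framesDecoded is used as the sample count; `InterFrameDelayTotals` and `interFrameDelayVariance` are hypothetical names.

```typescript
// Hypothetical subset of an inbound video stats entry.
interface InterFrameDelayTotals {
  totalInterFrameDelay: number;        // seconds
  totalSquaredInterFrameDelay: number; // seconds squared
  framesDecoded: number;
}

// Population variance of the interframe delay, in seconds squared:
// (sum of squares - (sum)^2 / n) / n, with n = framesDecoded.
function interFrameDelayVariance(s: InterFrameDelayTotals): number | undefined {
  const n = s.framesDecoded;
  if (n <= 0) return undefined;
  const sum = s.totalInterFrameDelay;
  return (s.totalSquaredInterFrameDelay - (sum * sum) / n) / n;
}
```

Taking the square root of the result gives a standard deviation in seconds, which is often easier to interpret as a jerkiness indicator.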
lastPacketReceivedTimestamp of type DOMHighResTimeStamp
Represents the timestamp at which the last packet was received for this SSRC.
This differs from
timestamp, which represents the time at which the
statistics were generated by the local endpoint.
headerBytesReceived of type unsigned long long
Total number of RTP header and padding bytes received for this SSRC. This does
not include the size of transport layer headers such as IP or UDP.
headerBytesReceived + bytesReceived equals the number of bytes
received as payload over the transport.
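As an illustration, the two byte counters can be combined with the report timestamp to estimate a receive bitrate. This is a sketch under the assumption that timestamp is in milliseconds (it is a DOMHighResTimeStamp); `ByteSnapshot` and `receivedBitsPerSecond` are hypothetical names.

```typescript
// Hypothetical subset of an inbound stats entry copied from one report.
interface ByteSnapshot {
  timestamp: number;           // milliseconds
  bytesReceived: number;       // payload bytes
  headerBytesReceived: number; // RTP header and padding bytes
}

// Receive rate in bits per second between two snapshots, counting RTP
// payload, headers and padding, but not IP or UDP overhead.
function receivedBitsPerSecond(prev: ByteSnapshot, curr: ByteSnapshot): number | undefined {
  const seconds = (curr.timestamp - prev.timestamp) / 1000;
  if (seconds <= 0) return undefined;
  const bytes =
    (curr.bytesReceived + curr.headerBytesReceived) -
    (prev.bytesReceived + prev.headerBytesReceived);
  return (bytes * 8) / seconds;
}
```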
packetsDiscarded of type unsigned long long
The cumulative number of RTP packets discarded by the jitter buffer due to late
or early arrival, i.e., these packets are not played out. RTP packets discarded
due to packet duplication are not reported in this metric [XRBLOCK-STATS].
Calculated as defined in [RFC7002] section 3.2 and Appendix A.a.
fecPacketsReceived of type unsigned long long
Total number of RTP FEC packets received for this SSRC. This counter can also be
incremented when receiving FEC packets in-band with media packets (e.g., with Opus).
fecPacketsDiscarded of type unsigned long long
Total number of RTP FEC packets received for this SSRC where the error correction
payload was discarded by the application. This may happen 1. if all the source
packets protected by the FEC packet were received or already recovered by a
separate FEC packet, or 2. if the FEC packet arrived late, i.e., outside the
recovery window, and the lost RTP packets have already been skipped during
playout. This is a subset of fecPacketsReceived.
bytesReceived of type unsigned long long
Total number of bytes received for this SSRC. Calculated as defined in
[RFC3550] section 6.4.1.
firCount of type unsigned long
Only exists for video. Counts the total number of Full Intra Request (FIR) packets
sent by this receiver. Calculated as defined in [RFC5104] section 4.3.1. This
does not use the metric indicated in [RFC2032], because it was deprecated by
[RFC4587].
pliCount of type unsigned long
Only exists for video. Counts the total number of Picture Loss Indication (PLI)
packets sent by this receiver. Calculated as defined in [RFC4585] section 6.3.1.
totalProcessingDelay of type double
The sum of the time, in seconds, each audio sample or video frame takes from
the time the first RTP packet is received (reception timestamp) to the time
the corresponding sample or frame is decoded (decoded timestamp). At this point the audio
sample or video frame is ready for playout by the MediaStreamTrack. Typically ready for
playout here means after the audio sample or video frame is fully decoded by the decoder.
Given the complexities involved, the time of arrival or the reception timestamp is measured
as close to the network layer as possible and the decoded timestamp is measured as soon as the
complete sample or frame is decoded.
In the case of audio, several samples are received in the same RTP packet, so all samples
share the same reception timestamp but have different decoded timestamps.
In the case of video, the frame is received over several RTP packets; in this
case the earliest reception time of a packet containing the frame is counted as the
reception timestamp, and the decoded timestamp corresponds to when the complete frame is decoded.
This metric is not incremented for frames that are not decoded.
The average processing delay can be calculated by dividing
totalProcessingDelay by
framesDecoded for video (or by
totalSamplesDecoded, from the provisional stats spec, for audio).
nackCount of type unsigned long
Counts the total number of Negative ACKnowledgement (NACK) packets sent by this
receiver. Calculated as defined in [RFC4585] section 6.2.1.
estimatedPlayoutTimestamp of type DOMHighResTimeStamp
This is the estimated playout time of this receiver's track. The playout time is
the NTP timestamp of the last playable audio sample or video frame that has a known
timestamp (from an RTCP SR packet mapping RTP timestamps to NTP timestamps),
extrapolated with the time elapsed since it was ready to be played out. This is
the "current time" of the track in NTP clock time of the sender and can be present
even if there is no audio currently playing.
This can be useful for estimating how much audio and video is out of sync for two
tracks from the same source:
audioTrackStats.estimatedPlayoutTimestamp - videoTrackStats.estimatedPlayoutTimestamp.
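The comparison between an audio and a video track from the same source can be sketched as below. The `PlayoutEstimate` interface and `avSyncOffsetMs` helper are hypothetical, and the sketch assumes both timestamps are expressed in milliseconds on the same sender NTP clock.

```typescript
// Hypothetical subset of an inbound stats entry; the member may be absent
// until a playable sample or frame with a known timestamp exists.
interface PlayoutEstimate {
  estimatedPlayoutTimestamp?: number; // milliseconds, sender NTP clock
}

// Estimated audio/video offset in milliseconds.
// Positive: audio playout is ahead of video. Negative: audio lags video.
function avSyncOffsetMs(
  audioTrackStats: PlayoutEstimate,
  videoTrackStats: PlayoutEstimate,
): number | undefined {
  const audio = audioTrackStats.estimatedPlayoutTimestamp;
  const video = videoTrackStats.estimatedPlayoutTimestamp;
  if (audio === undefined || video === undefined) return undefined;
  return audio - video;
}
```

The value is only an estimate, so smoothing it over several reports before acting on it is likely to be more robust than using a single sample.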
jitterBufferDelay of type double
The purpose of the jitter buffer is to recombine RTP packets into frames (in the case of video)
and have smooth playout. The model described here assumes that the samples or frames are
still compressed and have not yet been decoded.
It is the sum of the time, in seconds, each audio sample or video frame takes from
the time the first packet is received by the jitter buffer (ingest timestamp) to the
time it exits the jitter buffer (emit timestamp).
In the case of audio, several samples belong to the same RTP packet, hence they will have the same
ingest timestamp but different jitter buffer emit timestamps.
In the case of video, the frame may be received over several RTP packets, hence the ingest timestamp
is that of the earliest packet of the frame that entered the jitter buffer and the emit timestamp is
when the whole frame exits the jitter buffer.
This metric increases upon samples or frames exiting, having completed their time in the buffer (and
incrementing jitterBufferEmittedCount). The average jitter buffer
delay can be calculated by dividing
jitterBufferDelay by jitterBufferEmittedCount.
jitterBufferTargetDelay of type double
This value is increased by the target jitter buffer delay every time a
sample is emitted by the jitter buffer. The added target is the target
delay, in seconds, at the time that the sample was emitted from the
jitter buffer. To get the average target delay, divide by jitterBufferEmittedCount.
jitterBufferEmittedCount of type unsigned long long
The total number of audio samples or video frames that have come out of the
jitter buffer (increasing jitterBufferDelay).
jitterBufferMinimumDelay of type double
There are various reasons why the jitter buffer delay might be increased to a higher value, such as
to achieve AV synchronization or because a playoutDelay
was set on an RTCRtpReceiver. When using one of these mechanisms, it can be useful to keep track of
the minimal jitter buffer delay that could have been achieved, so WebRTC clients can track the amount
of additional delay that is being added.
This metric works the same way as
jitterBufferTargetDelay, except that it is not affected by
external mechanisms that increase the jitter buffer target delay, such as playoutDelay,
AV sync, or any other mechanisms. This metric is purely based on the network characteristics such
as jitter and packet loss, and can be seen as the minimum obtainable jitter buffer delay if no
external factors affected it. The metric is updated every time
jitterBufferEmittedCount is updated.
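The three jitter buffer sums share jitterBufferEmittedCount as their divisor, so their averages can be computed together. The sketch below uses lifetime totals; the interface and function names are hypothetical, and the "added by external mechanisms" figure is an interpretation (average target minus average minimum), not a defined metric.

```typescript
// Hypothetical subset of an inbound stats entry.
interface JitterBufferTotals {
  jitterBufferDelay: number;        // seconds, summed per emitted sample/frame
  jitterBufferTargetDelay: number;  // seconds, summed per emitted sample/frame
  jitterBufferMinimumDelay: number; // seconds, summed per emitted sample/frame
  jitterBufferEmittedCount: number;
}

// Per-sample (or per-frame) averages, all in seconds.
interface JitterBufferAverages {
  delay: number;
  target: number;
  minimum: number;
  addedByExternalMechanisms: number; // assumption: average target minus average minimum
}

function jitterBufferAverages(s: JitterBufferTotals): JitterBufferAverages | undefined {
  const n = s.jitterBufferEmittedCount;
  if (n <= 0) return undefined;
  return {
    delay: s.jitterBufferDelay / n,
    target: s.jitterBufferTargetDelay / n,
    minimum: s.jitterBufferMinimumDelay / n,
    addedByExternalMechanisms: (s.jitterBufferTargetDelay - s.jitterBufferMinimumDelay) / n,
  };
}
```

For a recent-interval view, the same function can be applied to the differences between two reports rather than to the raw totals.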
totalSamplesReceived of type unsigned long long
Only exists for audio. The total number of samples that have been received on this
RTP stream. This includes concealedSamples.
concealedSamples of type unsigned long long
Only exists for audio. The total number of samples that are concealed samples. A
concealed sample is a sample that was replaced with synthesized samples generated
locally before being played out. Examples of samples that have to be concealed
are samples from lost packets (reported in
packetsLost) or samples from packets that arrive
too late to be played out (reported in packetsDiscarded).
silentConcealedSamples of type unsigned long long
Only exists for audio. The total number of concealed samples inserted that are
"silent". Playing out silent samples results in silence or comfort noise. This is
a subset of concealedSamples.
concealmentEvents of type unsigned long long
Only exists for audio. The number of concealment events. This counter increases every
time a concealed sample is synthesized after a non-concealed sample. That is, multiple
consecutive concealed samples will increase the
concealedSamples count multiple
times but count as a single concealment event.
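The concealment counters can be combined into a few derived figures. This is a sketch only; `ConcealmentCounters`, `ConcealmentSummary` and `summarizeConcealment` are hypothetical names, and the derived ratios are illustrative rather than defined metrics.

```typescript
// Hypothetical subset of an inbound audio stats entry.
interface ConcealmentCounters {
  totalSamplesReceived: number;
  concealedSamples: number;
  silentConcealedSamples: number;
  concealmentEvents: number;
}

interface ConcealmentSummary {
  concealedRatio: number | undefined;      // concealed / all received samples
  silentShare: number | undefined;         // silent / all concealed samples
  averageEventSamples: number | undefined; // concealed samples per concealment event
}

// Division helper that yields undefined instead of NaN or Infinity.
function ratioOrUndefined(numerator: number, denominator: number): number | undefined {
  return denominator > 0 ? numerator / denominator : undefined;
}

function summarizeConcealment(c: ConcealmentCounters): ConcealmentSummary {
  return {
    concealedRatio: ratioOrUndefined(c.concealedSamples, c.totalSamplesReceived),
    silentShare: ratioOrUndefined(c.silentConcealedSamples, c.concealedSamples),
    averageEventSamples: ratioOrUndefined(c.concealedSamples, c.concealmentEvents),
  };
}
```

A high concealed ratio with few events suggests long outages, while the same ratio spread over many events suggests frequent short gaps.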
insertedSamplesForDeceleration of type unsigned long long
Only exists for audio. When playout is slowed down, this counter is increased by the
difference between the number of samples received and the number of samples played out.
If playout is slowed down by inserting samples, this will be the number of inserted
samples.
removedSamplesForAcceleration of type unsigned long long
Only exists for audio. When playout is sped up, this counter is increased by the
difference between the number of samples received and the number of samples played
out. If speedup is achieved by removing samples, this will be the count of samples
removed.
audioLevel of type double
Only exists for audio. Represents the audio level of the receiving track. For audio
levels of tracks attached locally, see RTCAudioSourceStats instead.
The value is between 0..1 (linear), where 1.0 represents 0 dBov, 0 represents
silence, and 0.5 represents approximately 6 dBSPL change in the sound pressure
level from 0 dBov.
audioLevel is averaged over some small interval, using the algorithm described under
totalAudioEnergy. The interval used is implementation dependent.
totalAudioEnergy of type double
Only exists for audio. Represents the audio energy of the receiving track. For
audio energy of tracks attached locally, see RTCAudioSourceStats instead.
This value MUST be computed as follows: for each audio sample that is received
(and thus counted by
totalSamplesReceived), add the sample's
value divided by the highest-intensity encodable value, squared and then
multiplied by the duration of the sample in seconds. In other words,
duration * Math.pow(energy/maxEnergy, 2).
This can be used to obtain a root mean square (RMS) value that uses the same
units as audioLevel, as defined in [RFC6464]. It can be
converted to these units using the formula
Math.sqrt(totalAudioEnergy/totalSamplesDuration). This calculation
can also be performed using the differences between the values of two different
getStats() calls, in order to compute the average audio level over
any desired time interval. In other words, do
Math.sqrt((energy2 - energy1)/(duration2 - duration1)).
For example, if a 10ms packet of audio is produced with an RMS of 0.5 (out of
1.0), this should add
0.5 * 0.5 * 0.01 = 0.0025 to
totalAudioEnergy. If another 10ms packet with an RMS of 0.1 is
received, this should similarly add 0.0001 to totalAudioEnergy. Then
Math.sqrt(totalAudioEnergy/totalSamplesDuration) becomes
Math.sqrt(0.0026/0.02) = 0.36, which is the same value that would be
obtained by doing an RMS calculation over the contiguous 20ms segment of audio.
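The interval calculation and the two-packet example can be sketched as follows. The `AudioEnergySnapshot` interface and both helper functions are hypothetical; the sketch assumes the cumulative values were copied from two successive stats reports.

```typescript
// Hypothetical pair of cumulative audio counters copied from one stats report.
interface AudioEnergySnapshot {
  totalAudioEnergy: number;
  totalSamplesDuration: number; // seconds
}

// Energy contributed by a run of audio with the given RMS level (0..1) and duration.
function energyOf(rms: number, durationSeconds: number): number {
  return durationSeconds * Math.pow(rms, 2);
}

// Average audio level, in the same 0..1 units as audioLevel, between two snapshots.
// Returns undefined when no audio duration elapsed in the interval.
function averageAudioLevel(prev: AudioEnergySnapshot, curr: AudioEnergySnapshot): number | undefined {
  const duration = curr.totalSamplesDuration - prev.totalSamplesDuration;
  if (duration <= 0) return undefined;
  const energy = curr.totalAudioEnergy - prev.totalAudioEnergy;
  return Math.sqrt(Math.max(energy, 0) / duration);
}

// The example from the text: two 10 ms packets with RMS 0.5 and 0.1.
const earlier: AudioEnergySnapshot = { totalAudioEnergy: 0, totalSamplesDuration: 0 };
const later: AudioEnergySnapshot = {
  totalAudioEnergy: energyOf(0.5, 0.01) + energyOf(0.1, 0.01),
  totalSamplesDuration: 0.02,
};
const exampleLevel = averageAudioLevel(earlier, later); // roughly 0.36
```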
If multiple audio channels are used, the
audio energy of a sample refers to the highest energy of any channel.
totalSamplesDuration of type double
Only exists for audio. Represents the audio duration of the receiving track. For
audio durations of tracks attached locally, see RTCAudioSourceStats instead.
Represents the total duration in seconds of all samples that have been received
(and thus counted by
totalSamplesReceived). Can be used with
totalAudioEnergy to compute an average audio level over different intervals.
framesReceived of type unsigned long
Only exists for video. Represents the total number of complete frames received on
this RTP stream. This metric is incremented when the complete frame is received.
decoderImplementation of type DOMString
Identifies the decoder implementation used. This is useful for diagnosing
interoperability issues.
If too much information is given here, it increases the fingerprint surface.
Since it is only given for active tracks, the incremental exposure is small.
playoutId of type DOMString
If audio playout is happening, this is used to look up the