-
trackIdentifier
of type DOMString
-
The value of the MediaStreamTrack
's id
attribute.
-
kind
of type DOMString
-
The value of the MediaStreamTrack
's kind
attribute.
This is either "audio"
or "video"
.
-
mid
of type DOMString
-
If the RTCRtpTransceiver
owning this stream has a
mid
value that is not null, this is that
value, otherwise this member is not present.
-
remoteId
of type DOMString
-
The remoteId
is used for looking up the remote
RTCRemoteOutboundRtpStreamStats
object for the same SSRC.
-
framesDecoded
of type unsigned long
-
Only exists for video. It represents the total number of frames correctly decoded
for this RTP stream, i.e., frames that would be displayed if no frames are dropped.
-
keyFramesDecoded
of type unsigned long
-
Only exists for video. It represents the total number of key frames, such as key
frames in VP8 [RFC6386] or IDR-frames in H.264 [RFC6184], successfully
decoded for this RTP media stream. This is a subset of
framesDecoded
. framesDecoded - keyFramesDecoded
gives
you the number of delta frames decoded.
-
framesRendered
of type unsigned long
-
Only exists for video. It represents the total number of frames that have been
rendered. It is incremented just after a frame has been rendered.
-
framesDropped
of type unsigned
long
-
Only exists for video. The total number of frames dropped prior to decode or dropped
because the frame missed its display deadline for this receiver's track. The measurement
begins when the receiver is created and is a cumulative metric as defined in
Appendix A (g) of [RFC7004].
-
frameWidth
of type unsigned
long
-
Only exists for video. Represents the width of the last decoded frame. Before the
first frame is decoded this member does not exist.
-
frameHeight
of type unsigned
long
-
Only exists for video. Represents the height of the last decoded frame. Before
the first frame is decoded this member does not exist.
-
framesPerSecond
of type double
-
Only exists for video. The number of decoded frames in the last second.
-
qpSum
of type unsigned long
long
-
Only exists for video. The sum of the QP values of frames decoded by this
receiver. The count of frames is in framesDecoded
.
The definition of QP value depends on the codec; for VP8, the QP value is the
value carried in the frame header as the syntax element y_ac_qi
, and defined in
[RFC6386] section 19.2. Its range is 0..127.
Note that the QP value is only an indication of quantizer values used; many
formats have ways to vary the quantizer value within the frame.
-
totalDecodeTime
of type double
-
Total number of seconds that have been spent decoding the framesDecoded
frames of this stream. The average decode time can be calculated by dividing this
value by framesDecoded
. The time it takes to decode one frame is the
time passed between feeding the decoder a frame and the decoder returning decoded
data for that frame.
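The division described above can be sketched as follows. This is a minimal illustration, not part of the specification: the `DecodeStats` interface and the function name are hypothetical, and the conversion to milliseconds is a convenience chosen here.

```typescript
// Hypothetical subset of an inbound RTP stats object; only the members used here.
interface DecodeStats {
  framesDecoded?: number;
  totalDecodeTime?: number; // seconds
}

// Average decode time per frame in milliseconds.
// Returns undefined when no frame has been decoded yet (avoids dividing by zero).
function averageDecodeTimeMs(stats: DecodeStats): number | undefined {
  const { framesDecoded, totalDecodeTime } = stats;
  if (!framesDecoded || totalDecodeTime === undefined) {
    return undefined;
  }
  return (totalDecodeTime / framesDecoded) * 1000;
}
```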
-
totalInterFrameDelay
of type double
-
Sum of the interframe delays in seconds between consecutively rendered frames,
recorded just after a frame has been rendered. The interframe delay variance can be
calculated from totalInterFrameDelay, totalSquaredInterFrameDelay, and framesRendered
according to the formula:
(totalSquaredInterFrameDelay - totalInterFrameDelay^2 / framesRendered) / framesRendered
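As an illustration, the variance formula can be written as a small helper. The function name is hypothetical and the guard against a zero frame count is an assumption added here; the arithmetic follows the formula as stated.

```typescript
// Interframe delay variance (in seconds squared) from the three cumulative members.
// Returns undefined until at least one frame has been rendered.
function interFrameDelayVariance(
  totalInterFrameDelay: number,
  totalSquaredInterFrameDelay: number,
  framesRendered: number,
): number | undefined {
  if (framesRendered <= 0) {
    return undefined;
  }
  const squaredSum = totalInterFrameDelay * totalInterFrameDelay;
  return (totalSquaredInterFrameDelay - squaredSum / framesRendered) / framesRendered;
}
```

For example, two recorded delays of 0.02 s and 0.04 s give a sum of 0.06 and a sum of squares of 0.002, so the result is (0.002 - 0.0036 / 2) / 2 = 0.0001.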
-
totalSquaredInterFrameDelay
of type double
-
Sum of the squared interframe delays in seconds between consecutively rendered frames,
recorded just after a frame has been rendered. See totalInterFrameDelay
for
details on how to calculate the interframe delay variance.
-
pauseCount
of type
unsigned long
-
Count the total number of video pauses experienced by this receiver.
Video is considered to be paused if the time passed since the last rendered
frame exceeds 5 seconds. pauseCount
is incremented when a frame
is rendered after such a pause.
-
totalPausesDuration
of type
double
-
Total duration of pauses (for definition of pause see pauseCount
),
in seconds. This value is updated when a frame is rendered.
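A rough sketch of how the two pause counters could be derived from a list of render times is shown below. The function name and input format are hypothetical, and it assumes that the whole gap between the two frames is counted as the pause duration, which the text above does not state explicitly.

```typescript
// Threshold from the pauseCount definition: more than 5 seconds without a rendered frame.
const PAUSE_THRESHOLD_SECONDS = 5;

interface PauseCounters {
  pauseCount: number;
  totalPausesDuration: number; // seconds
}

// renderTimesSeconds: render times of consecutive frames, in ascending order.
// Both counters are updated when a frame is rendered after a pause.
function countPauses(renderTimesSeconds: number[]): PauseCounters {
  const counters: PauseCounters = { pauseCount: 0, totalPausesDuration: 0 };
  for (let i = 1; i < renderTimesSeconds.length; i++) {
    const gap = renderTimesSeconds[i] - renderTimesSeconds[i - 1];
    if (gap > PAUSE_THRESHOLD_SECONDS) {
      counters.pauseCount += 1;
      // Assumption: the full gap is attributed to the pause.
      counters.totalPausesDuration += gap;
    }
  }
  return counters;
}
```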
-
freezeCount
of type
unsigned long
-
Count the total number of video freezes experienced by this receiver.
It is a freeze if the frame duration, which is the time interval between two
consecutively rendered frames, is equal to or exceeds
Max(3 * avg_frame_duration_ms, avg_frame_duration_ms + 150), where
avg_frame_duration_ms is the linear average of the durations of the last 30 rendered
frames.
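The freeze condition can be sketched like this. The function names and the idea of passing the recent durations as an array are hypothetical; when fewer than 30 durations are available, this sketch simply averages what it has, which is an assumption.

```typescript
// Freeze threshold in milliseconds, based on the last 30 rendered frame durations.
function freezeThresholdMs(frameDurationsMs: number[]): number | undefined {
  const recent = frameDurationsMs.slice(-30);
  if (recent.length === 0) {
    return undefined;
  }
  const avgFrameDurationMs = recent.reduce((sum, d) => sum + d, 0) / recent.length;
  return Math.max(3 * avgFrameDurationMs, avgFrameDurationMs + 150);
}

// A frame duration counts as a freeze if it is equal to or exceeds the threshold.
function isFreeze(frameDurationMs: number, previousDurationsMs: number[]): boolean {
  const threshold = freezeThresholdMs(previousDurationsMs);
  return threshold !== undefined && frameDurationMs >= threshold;
}
```

With an average frame duration of 33 ms the additive term dominates (33 + 150 = 183 ms), while for an average of 100 ms the multiplicative term dominates (3 * 100 = 300 ms).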
-
totalFreezesDuration
of type
double
-
Total duration of rendered frames which are considered as frozen (for
definition of freeze see freezeCount
), in seconds. This value
is updated when a frame is rendered.
-
lastPacketReceivedTimestamp
of type DOMHighResTimeStamp
-
Represents the timestamp at which the last packet was received for this SSRC.
This differs from timestamp
, which represents the time at which the
statistics were generated by the local endpoint.
-
headerBytesReceived
of type unsigned long long
-
Total number of RTP header and padding bytes received for this SSRC. This does
not include the size of transport layer headers such as IP or UDP.
headerBytesReceived + bytesReceived
equals the number of bytes
received as payload over the transport.
-
packetsDiscarded
of type unsigned long long
-
The cumulative number of RTP packets discarded by the jitter buffer due to late
or early-arrival, i.e., these packets are not played out. RTP packets discarded
due to packet duplication are not reported in this metric [XRBLOCK-STATS].
Calculated as defined in [RFC7002] section 3.2 and Appendix A.a.
-
fecPacketsReceived
of type unsigned long long
-
Total number of RTP FEC packets received for this SSRC. This counter can also be
incremented when receiving FEC packets in-band with media packets (e.g., with
Opus).
-
fecPacketsDiscarded
of type unsigned long long
-
Total number of RTP FEC packets received for this SSRC where the error correction
payload was discarded by the application. This may happen 1. if all the source
packets protected by the FEC packet were received or already recovered by a
separate FEC packet, or 2. if the FEC packet arrived late, i.e., outside the
recovery window, and the lost RTP packets have already been skipped during
playout. This is a subset of fecPacketsReceived
.
-
bytesReceived
of type unsigned
long long
-
Total number of bytes received for this SSRC. Calculated as defined in
[RFC3550] section 6.4.1.
-
firCount
of type unsigned
long
-
Only exists for video. Count the total number of Full Intra Request (FIR) packets,
as defined in [RFC5104] section 4.3.1, sent by this receiver. Does not count the RTCP FIR
indicated in [RFC2032] which was deprecated by [RFC4587].
-
pliCount
of type unsigned
long
-
Only exists for video. Count the total number of Picture Loss Indication (PLI)
packets, as defined in [RFC4585] section 6.3.1, sent by this receiver.
-
totalProcessingDelay
of type double
-
It is the sum of the time, in seconds, each audio sample or video frame takes from
the time the first RTP packet is received (reception timestamp) to the time
the corresponding sample or frame is decoded (decoded timestamp). At this point the audio
sample or video frame is ready for playout by the MediaStreamTrack. Typically ready for
playout here means after the audio sample or video frame is fully decoded by the decoder.
Given the complexities involved, the time of arrival or the reception timestamp is measured
as close to the network layer as possible and the decoded timestamp is measured as soon as the
complete sample or frame is decoded.
In the case of audio, several samples are received in the same RTP packet, so all samples
share the same reception timestamp but have different decoded timestamps.
In the case of video, the frame is received over several RTP packets. In this
case the earliest timestamp of a packet containing the frame is counted as the reception timestamp,
and the decoded timestamp corresponds to when the complete frame is decoded.
This metric is not incremented for frames that are not decoded,
i.e. framesDropped
.
The average processing delay can be calculated by dividing the totalProcessingDelay
by the
framesDecoded
for video (or provisional stats spec totalSamplesDecoded
for audio).
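A minimal sketch of that calculation, choosing the divisor by media kind, might look like this. The interface and function names are hypothetical, and the availability of totalSamplesDecoded on the stats object is assumed.

```typescript
// Hypothetical subset of an inbound RTP stats object.
interface ProcessingDelayStats {
  kind: "audio" | "video";
  totalProcessingDelay?: number; // seconds
  framesDecoded?: number; // video
  totalSamplesDecoded?: number; // audio, from the provisional stats spec
}

// Average processing delay in seconds per video frame or per audio sample.
function averageProcessingDelay(stats: ProcessingDelayStats): number | undefined {
  const count = stats.kind === "video" ? stats.framesDecoded : stats.totalSamplesDecoded;
  if (!count || stats.totalProcessingDelay === undefined) {
    return undefined;
  }
  return stats.totalProcessingDelay / count;
}
```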
-
nackCount
of type unsigned
long
-
Count the total number of Negative ACKnowledgement (NACK) packets, as defined in [RFC4585]
section 6.2.1, sent by this receiver.
-
estimatedPlayoutTimestamp
of type DOMHighResTimeStamp
-
This is the estimated playout time of this receiver's track. The playout time is
the NTP timestamp of the last playable audio sample or video frame that has a known
timestamp (from an RTCP SR packet mapping RTP timestamps to NTP timestamps),
extrapolated with the time elapsed since it was ready to be played out. This is
the "current time" of the track in NTP clock time of the sender and can be present
even if there is no audio currently playing.
This can be useful for estimating how much audio and video is out of sync for two
tracks from the same source:
audioTrackStats.estimatedPlayoutTimestamp - videoTrackStats.estimatedPlayoutTimestamp
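The subtraction can be wrapped in a small helper, sketched here. The function name is hypothetical; the result is in milliseconds because DOMHighResTimeStamp values are expressed in milliseconds, and the sign convention (positive when the audio track's playout time is ahead) simply follows the order of the subtraction.

```typescript
// Estimated audio/video offset in milliseconds for two tracks from the same source.
// Returns undefined if either track has no estimated playout timestamp yet.
function avSyncOffsetMs(
  audioEstimatedPlayoutTimestamp: number | undefined,
  videoEstimatedPlayoutTimestamp: number | undefined,
): number | undefined {
  if (audioEstimatedPlayoutTimestamp === undefined || videoEstimatedPlayoutTimestamp === undefined) {
    return undefined;
  }
  return audioEstimatedPlayoutTimestamp - videoEstimatedPlayoutTimestamp;
}
```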
-
jitterBufferDelay
of type double
-
The purpose of the jitter buffer is to recombine RTP packets into frames (in the case of video)
and have smooth playout. The model described here assumes that the samples or frames are
still compressed and have not yet been decoded.
It is the sum of the time, in seconds, each audio sample or a video frame takes from
the time the first packet is received by the jitter buffer (ingest timestamp) to the
time it exits the jitter buffer (emit timestamp).
In the case of audio, several samples belong to the same RTP packet, hence they will have the same
ingest timestamp but different jitter buffer emit timestamps.
In the case of video, the frame may be received over several RTP packets, hence the ingest timestamp
is the earliest packet of the frame that entered the jitter buffer and the emit timestamp is
when the whole frame exits the jitter buffer.
This metric increases upon samples or frames exiting, having completed their time in the buffer (and
incrementing jitterBufferEmittedCount
). The average jitter buffer
delay can be calculated by dividing the jitterBufferDelay
by the
jitterBufferEmittedCount
.
-
jitterBufferTargetDelay
of type double
-
This value is increased by the target jitter buffer delay every time a
sample is emitted by the jitter buffer. The added target is the target
delay, in seconds, at the time that the sample was emitted from the
jitter buffer. To get the average target delay, divide by
jitterBufferEmittedCount
.
-
jitterBufferEmittedCount
of type unsigned long long
-
The total number of audio samples or video frames that have come out of the
jitter buffer (increasing jitterBufferDelay
).
-
jitterBufferMinimumDelay
of type double
-
There are various reasons why the jitter buffer delay might be increased to a higher value, such as
to achieve AV synchronization or because a
playoutDelay
was set on an RTCRtpReceiver. When using one of these mechanisms, it can be useful to keep track of
the minimal jitter buffer delay that could have been achieved, so WebRTC clients can track the amount
of additional delay that is being added.
This metric works the same way as jitterBufferTargetDelay
, except that it is not affected by
external mechanisms that increase the jitter buffer target delay, such as playoutDelay,
AV sync, or any other mechanisms. This metric is purely based on the network characteristics such
as jitter and packet loss, and can be seen as the minimum obtainable jitter buffer delay if no
external factors would affect it. The metric is updated every time jitterBufferEmittedCount
is updated.
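The three jitter buffer sums share the same divisor, so their averages can be computed together, as sketched below. The interface and function names are hypothetical. Reporting the difference between the average target delay and the average minimum delay as "added delay" is an interpretation made here of the "additional delay" mentioned above.

```typescript
// Hypothetical subset of an inbound RTP stats object; all delays are sums in seconds.
interface JitterBufferStats {
  jitterBufferDelay: number;
  jitterBufferTargetDelay: number;
  jitterBufferMinimumDelay: number;
  jitterBufferEmittedCount: number;
}

interface JitterBufferAverages {
  averageDelay: number;
  averageTargetDelay: number;
  averageMinimumDelay: number;
  addedDelay: number; // assumption: target average minus minimum average
}

// Per-sample (audio) or per-frame (video) averages, in seconds.
function jitterBufferAverages(stats: JitterBufferStats): JitterBufferAverages | undefined {
  const emitted = stats.jitterBufferEmittedCount;
  if (emitted <= 0) {
    return undefined;
  }
  const averageTargetDelay = stats.jitterBufferTargetDelay / emitted;
  const averageMinimumDelay = stats.jitterBufferMinimumDelay / emitted;
  return {
    averageDelay: stats.jitterBufferDelay / emitted,
    averageTargetDelay,
    averageMinimumDelay,
    addedDelay: averageTargetDelay - averageMinimumDelay,
  };
}
```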
-
totalSamplesReceived
of type unsigned long long
-
Only exists for audio. The total number of samples that have been received on this
RTP stream. This includes concealedSamples
.
-
concealedSamples
of type unsigned long long
-
Only exists for audio. The total number of samples that are concealed samples. A
concealed sample is a sample that was replaced with synthesized samples generated
locally before being played out. Examples of samples that have to be concealed
are samples from lost packets (reported in packetsLost
) or samples from packets that arrive
too late to be played out (reported in packetsDiscarded
).
-
silentConcealedSamples
of type unsigned long long
-
Only exists for audio. The total number of concealed samples inserted that are
"silent". Playing out silent samples results in silence or comfort noise. This is
a subset of concealedSamples
.
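Because concealedSamples is included in totalSamplesReceived and silentConcealedSamples is a subset of concealedSamples, two ratios follow directly. The sketch below is illustrative; the interface, the function name, and the names of the returned ratios are hypothetical.

```typescript
// Hypothetical subset of an audio inbound RTP stats object.
interface ConcealmentStats {
  totalSamplesReceived: number;
  concealedSamples: number;
  silentConcealedSamples: number;
}

interface ConcealmentRatios {
  concealedRatio: number; // concealed samples as a fraction of all received samples
  silentShare: number; // silent samples as a fraction of concealed samples
}

function concealmentRatios(stats: ConcealmentStats): ConcealmentRatios | undefined {
  if (stats.totalSamplesReceived <= 0) {
    return undefined;
  }
  const concealedRatio = stats.concealedSamples / stats.totalSamplesReceived;
  // With no concealed samples there is nothing to split, so report a share of 0.
  const silentShare =
    stats.concealedSamples > 0 ? stats.silentConcealedSamples / stats.concealedSamples : 0;
  return { concealedRatio, silentShare };
}
```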
-
concealmentEvents
of type unsigned long long
-
Only exists for audio. The number of concealment events. This counter increases every
time a concealed sample is synthesized after a non-concealed sample. That is, multiple
consecutive concealed samples will increase the concealedSamples
count multiple
times but constitute a single concealment event.
-
insertedSamplesForDeceleration
of type unsigned long long
-
Only exists for audio. When playout is slowed down, this counter is increased by the
difference between the number of samples received and the number of samples played out.
If playout is slowed down by inserting samples, this will be the number of inserted
samples.
-
removedSamplesForAcceleration
of type unsigned long long
-
Only exists for audio. When playout is sped up, this counter is increased by the
difference between the number of samples received and the number of samples played
out. If speedup is achieved by removing samples, this will be the count of samples
removed.
-
audioLevel
of type double
-
Only exists for audio. Represents the audio level of the receiving track. For audio
levels of tracks attached locally, see RTCAudioSourceStats
instead.
The value is between 0..1 (linear), where 1.0 represents 0 dBov, 0 represents
silence, and 0.5 represents approximately 6 dBSPL change in the sound pressure
level from 0 dBov.
The audioLevel
is averaged over some small interval, using the algorithm
described under totalAudioEnergy
. The interval used is implementation
dependent.
-
totalAudioEnergy
of type double
-
Only exists for audio. Represents the audio energy of the receiving track. For
audio energy of tracks attached locally, see
RTCAudioSourceStats
instead.
This value MUST be computed as follows: for each audio sample that is received
(and thus counted by totalSamplesReceived
), add the sample's
value divided by the highest-intensity encodable value, squared and then
multiplied by the duration of the sample in seconds. In other words,
duration * Math.pow(energy/maxEnergy, 2)
.
This can be used to obtain a root mean square (RMS) value that uses the same
units as audioLevel
, as defined in [RFC6464]. It can be
converted to these units using the formula
Math.sqrt(totalAudioEnergy/totalSamplesDuration)
. This calculation
can also be performed using the differences between the values of two different
getStats() calls, in order to compute the average audio level over
any desired time interval. In other words, do
Math.sqrt((energy2 - energy1)/(duration2 - duration1))
.
For example, if a 10ms packet of audio is produced with an RMS of 0.5 (out of
1.0), this should add 0.5 * 0.5 * 0.01 = 0.0025
to
totalAudioEnergy
. If another 10ms packet with an RMS of 0.1 is
received, this should similarly add 0.0001
to
totalAudioEnergy
. Then,
Math.sqrt(totalAudioEnergy/totalSamplesDuration)
becomes
Math.sqrt(0.0026/0.02) = 0.36
, which is the same value that would be
obtained by doing an RMS calculation over the contiguous 20ms segment of audio.
If multiple audio channels are used, the
audio energy of a sample refers to the highest energy of any
channel.
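The interval calculation described above can be sketched as a function of two snapshots. The `AudioEnergySnapshot` interface and the function name are hypothetical; the snapshots are assumed to be taken from two successive getStats() results for the same stream.

```typescript
// Hypothetical pair of members read from one stats snapshot.
interface AudioEnergySnapshot {
  totalAudioEnergy: number;
  totalSamplesDuration: number; // seconds
}

// RMS audio level (0..1) over the interval between two snapshots.
// Returns undefined if no audio duration elapsed between them.
function averageAudioLevel(
  earlier: AudioEnergySnapshot,
  later: AudioEnergySnapshot,
): number | undefined {
  const duration = later.totalSamplesDuration - earlier.totalSamplesDuration;
  if (duration <= 0) {
    return undefined;
  }
  const energy = later.totalAudioEnergy - earlier.totalAudioEnergy;
  return Math.sqrt(energy / duration);
}
```

Using the worked example, a snapshot of (0, 0) followed by (0.0026, 0.02) yields approximately 0.36, and a snapshot of (0.0025, 0.01) followed by (0.0026, 0.02) isolates the second packet and yields approximately 0.1.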
-
totalSamplesDuration
of type double
-
Only exists for audio. Represents the audio duration of the receiving track. For
audio durations of tracks attached locally, see
RTCAudioSourceStats
instead.
Represents the total duration in seconds of all samples that have been received
(and thus counted by totalSamplesReceived
). Can be used with
totalAudioEnergy
to compute an average audio level over
different intervals.
-
framesReceived
of type unsigned
long
-
Only exists for video. Represents the total number of complete frames received on
this RTP stream. This metric is incremented when the complete frame is received.
-
decoderImplementation
of type DOMString
-
Only defined when exposing hardware is allowed.
Only exists for video. Identifies the decoder implementation used.
This is useful for diagnosing interoperability issues.
-
playoutId
of type DOMString
-
If audio playout is happening, this is used to look up the
corresponding RTCAudioPlayoutStats
.
-
powerEfficientDecoder
of type boolean
-
Only defined when exposing hardware is allowed.
Whether the decoder currently used is considered power
efficient by the user agent. This SHOULD reflect if the
configuration results in hardware acceleration, but the user
agent MAY take other information into account when deciding if
the configuration is considered power efficient.
-
framesAssembledFromMultiplePackets
of type unsigned long
-
Only exists for video. It represents the total number of frames correctly decoded
for this RTP stream that consist of more than one RTP packet. For such frames the
totalAssemblyTime
is incremented. The average frame assembly time can be calculated by
dividing the totalAssemblyTime
by framesAssembledFromMultiplePackets
.
-
totalAssemblyTime
of type double
-
Only exists for video. The sum of the time, in seconds, each video frame takes
from the time the first RTP packet is received (reception timestamp) to the time
the last RTP packet of a frame is received.
Given the complexities involved, the time of arrival or the reception timestamp is measured
as close to the network layer as possible. This metric is not incremented for frames that
are not decoded, i.e., framesDropped
or frames that fail decoding for other reasons
(if any). Only incremented for frames consisting of more than one RTP packet.