-
trackIdentifier
of type DOMString
-
The value of the MediaStreamTrack's id attribute.
-
kind
of type DOMString
-
The value of the MediaStreamTrack's kind attribute. This is either "audio" or "video".
-
mid
of type DOMString
-
If the RTCRtpTransceiver owning this stream has a mid value that is not null, this is that
value; otherwise this member is not present.
-
remoteId
of type DOMString
-
The remoteId is used for looking up the remote RTCRemoteOutboundRtpStreamStats
object for the same SSRC.
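The lookup described above can be sketched as follows, assuming the stats report behaves like a map keyed by stats object id. The StatsEntry and StatsLookup types and the findRemoteOutbound helper are illustrative names introduced for this sketch, not definitions from this specification.

```typescript
// Illustrative stand-in for a stats object: only the fields this sketch needs.
interface StatsEntry {
  id: string;
  type: string;
  remoteId?: string;
}

// Anything with a map-like get(), which is how a stats report is assumed to behave.
interface StatsLookup {
  get(id: string): StatsEntry | undefined;
}

// Returns the remote outbound stats paired with an inbound stream, or
// undefined when remoteId is absent or does not resolve to such an object.
function findRemoteOutbound(
  report: StatsLookup,
  inbound: StatsEntry
): StatsEntry | undefined {
  if (inbound.remoteId === undefined) {
    return undefined;
  }
  const remote = report.get(inbound.remoteId);
  if (remote === undefined || remote.type !== "remote-outbound-rtp") {
    return undefined;
  }
  return remote;
}
```

Checking the type of the resolved object is a defensive choice in this sketch; a caller that trusts the report could skip it.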
-
framesDecoded
of type unsigned long
-
Only exists for video. It represents the total number of frames correctly decoded
for this RTP stream, i.e., frames that would be displayed if no frames are dropped.
-
keyFramesDecoded
of type unsigned long
-
Only exists for video. It represents the total number of key frames, such as key
frames in VP8 [RFC6386] or IDR-frames in H.264 [RFC6184], successfully
decoded for this RTP media stream. This is a subset of framesDecoded.
framesDecoded - keyFramesDecoded gives you the number of delta frames decoded.
-
frameWidth
of type unsigned long
-
Only exists for video. Represents the width of the last decoded frame. Before the
first frame is decoded this member does not exist.
-
frameHeight
of type unsigned long
-
Only exists for video. Represents the height of the last decoded frame. Before
the first frame is decoded this member does not exist.
-
framesPerSecond
of type double
-
Only exists for video. The number of decoded frames in the last second.
-
qpSum
of type unsigned long long
-
Only exists for video. The sum of the QP values of frames decoded by this
receiver. The count of frames is in framesDecoded.
The definition of QP value depends on the codec; for VP8, the QP value is the
value carried in the frame header as the syntax element y_ac_qi, and defined in
[RFC6386] section 19.2. Its range is 0..127.
Note that the QP value is only an indication of quantizer values used; many
formats have ways to vary the quantizer value within the frame.
-
totalDecodeTime
of type double
-
Total number of seconds that have been spent decoding the framesDecoded
frames of this stream. The average decode time can be calculated by dividing this
value by framesDecoded. The time it takes to decode one frame is the
time passed between feeding the decoder a frame and the decoder returning decoded
data for that frame.
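The division described above can be sketched as follows. The DecodeStats type and the averageDecodeTimeMs helper are illustrative names, and returning undefined before any frame has been decoded is an assumption of this sketch rather than specified behavior.

```typescript
// Only the two members this sketch reads; both are cumulative counters.
interface DecodeStats {
  totalDecodeTime: number; // seconds
  framesDecoded: number;
}

// Average decode time per frame in milliseconds, or undefined before
// any frame has been decoded (avoids dividing by zero).
function averageDecodeTimeMs(stats: DecodeStats): number | undefined {
  if (stats.framesDecoded === 0) {
    return undefined;
  }
  return (stats.totalDecodeTime / stats.framesDecoded) * 1000;
}
```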
-
totalInterFrameDelay
of type double
-
Sum of the interframe delays in seconds between consecutively decoded frames,
recorded just after a frame has been decoded. The interframe delay variance can be
calculated from totalInterFrameDelay, totalSquaredInterFrameDelay,
and framesDecoded according to the formula:
(totalSquaredInterFrameDelay - totalInterFrameDelay^2/framesDecoded)/framesDecoded.
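A minimal sketch of the variance formula above, written as a function over the three counters. The InterFrameStats type and the interFrameDelayVariance helper are illustrative names; the zero-frame guard is an assumption of this sketch.

```typescript
// The three cumulative members the formula uses.
interface InterFrameStats {
  totalInterFrameDelay: number; // seconds
  totalSquaredInterFrameDelay: number; // seconds squared
  framesDecoded: number;
}

// (totalSquaredInterFrameDelay - totalInterFrameDelay^2 / framesDecoded) / framesDecoded
// Returns undefined when no frames have been decoded.
function interFrameDelayVariance(s: InterFrameStats): number | undefined {
  if (s.framesDecoded === 0) {
    return undefined;
  }
  const n = s.framesDecoded;
  const squaredSum = s.totalInterFrameDelay * s.totalInterFrameDelay;
  return (s.totalSquaredInterFrameDelay - squaredSum / n) / n;
}
```

Floating-point rounding can make the result very slightly negative when all delays are nearly equal, so a caller that takes a square root to get a standard deviation may want to clamp the value at zero first.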
-
totalSquaredInterFrameDelay
of type double
-
Sum of the squared interframe delays in seconds between consecutively decoded frames,
recorded just after a frame has been decoded. See totalInterFrameDelay for
details on how to calculate the interframe delay variance.
-
lastPacketReceivedTimestamp
of type DOMHighResTimeStamp
-
Represents the timestamp at which the last packet was received for this SSRC.
This differs from timestamp, which represents the time at which the
statistics were generated by the local endpoint.
-
headerBytesReceived
of type unsigned long long
-
Total number of RTP header and padding bytes received for this SSRC. This does
not include the size of transport layer headers such as IP or UDP.
headerBytesReceived + bytesReceived equals the number of bytes
received as payload over the transport.
-
packetsDiscarded
of type unsigned long long
-
The cumulative number of RTP packets discarded by the jitter buffer due to late
or early arrival, i.e., these packets are not played out. RTP packets discarded
due to packet duplication are not reported in this metric [XRBLOCK-STATS].
Calculated as defined in [RFC7002] section 3.2 and Appendix A.a.
-
fecPacketsReceived
of type unsigned long long
-
Total number of RTP FEC packets received for this SSRC. This counter can also be
incremented when receiving FEC packets in-band with media packets (e.g., with
Opus).
-
fecPacketsDiscarded
of type unsigned long long
-
Total number of RTP FEC packets received for this SSRC where the error correction
payload was discarded by the application. This may happen (1) if all the source
packets protected by the FEC packet were received or already recovered by a
separate FEC packet, or (2) if the FEC packet arrived late, i.e., outside the
recovery window, and the lost RTP packets have already been skipped during
playout. This is a subset of fecPacketsReceived.
-
bytesReceived
of type unsigned long long
-
Total number of bytes received for this SSRC. Calculated as defined in
[RFC3550] section 6.4.1.
-
firCount
of type unsigned long
-
Only exists for video. Counts the total number of Full Intra Request (FIR) packets
sent by this receiver. Calculated as defined in [RFC5104] section 4.3.1. It
does not use the metric indicated in [RFC2032], because it was deprecated by
[RFC4587].
-
pliCount
of type unsigned long
-
Only exists for video. Counts the total number of Picture Loss Indication (PLI)
packets sent by this receiver. Calculated as defined in [RFC4585] section 6.3.1.
-
totalProcessingDelay
of type double
-
The sum of the time, in seconds, each audio sample or video frame takes from
the time the first RTP packet is received (reception timestamp) to the time
the corresponding sample or frame is decoded (decoded timestamp). At this point the audio
sample or video frame is ready for playout by the MediaStreamTrack. Typically, ready for
playout here means after the audio sample or video frame is fully decoded by the decoder.
Given the complexities involved, the time of arrival or the reception timestamp is measured
as close to the network layer as possible and the decoded timestamp is measured as soon as the
complete sample or frame is decoded.
In the case of audio, several samples are received in the same RTP packet, so all samples
will share the same reception timestamp but have different decoded timestamps.
In the case of video, the frame is received over several RTP packets; in this
case the earliest reception timestamp of the packets containing the frame is counted as the
reception timestamp, and the decoded timestamp corresponds to when the complete frame is decoded.
This metric is not incremented for frames that are not decoded, i.e., framesDropped.
The average processing delay can be calculated by dividing the totalProcessingDelay
by the framesDecoded for video (or the provisional stats spec totalSamplesDecoded for audio).
-
nackCount
of type unsigned long
-
Counts the total number of Negative ACKnowledgement (NACK) packets sent by this
receiver. Calculated as defined in [RFC4585] section 6.2.1.
-
estimatedPlayoutTimestamp
of type DOMHighResTimeStamp
-
This is the estimated playout time of this receiver's track. The playout time is
the NTP timestamp of the last playable audio sample or video frame that has a known
timestamp (from an RTCP SR packet mapping RTP timestamps to NTP timestamps),
extrapolated with the time elapsed since it was ready to be played out. This is
the "current time" of the track in NTP clock time of the sender and can be present
even if there is no audio currently playing.
This can be useful for estimating how much audio and video is out of sync for two
tracks from the same source:
audioTrackStats.estimatedPlayoutTimestamp - videoTrackStats.estimatedPlayoutTimestamp.
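The subtraction above can be sketched as a small helper. The PlayoutStats type and the avSyncOffsetMs name are illustrative, and the sign convention in the comment is an interpretation made for this sketch; the result is in milliseconds because DOMHighResTimeStamp values are expressed in milliseconds.

```typescript
// Only the member this sketch reads; it may be absent on a stats object.
interface PlayoutStats {
  estimatedPlayoutTimestamp?: number; // DOMHighResTimeStamp, milliseconds
}

// Audio minus video playout time. A positive result is read here as audio
// playing out ahead of video; a negative result as audio lagging behind.
// Returns undefined if either timestamp is missing.
function avSyncOffsetMs(
  audio: PlayoutStats,
  video: PlayoutStats
): number | undefined {
  if (
    audio.estimatedPlayoutTimestamp === undefined ||
    video.estimatedPlayoutTimestamp === undefined
  ) {
    return undefined;
  }
  return audio.estimatedPlayoutTimestamp - video.estimatedPlayoutTimestamp;
}
```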
-
jitterBufferDelay
of type double
-
The purpose of the jitter buffer is to recombine RTP packets into frames (in the case of video)
and have smooth playout. The model described here assumes that the samples or frames are
still compressed and have not yet been decoded.
It is the sum of the time, in seconds, each audio sample or video frame takes from
the time the first packet is received by the jitter buffer (ingest timestamp) to the
time it exits the jitter buffer (emit timestamp).
In the case of audio, several samples belong to the same RTP packet, hence they will have the same
ingest timestamp but different jitter buffer emit timestamps.
In the case of video, the frame may be received over several RTP packets, hence the ingest timestamp
is that of the earliest packet of the frame that entered the jitter buffer and the emit timestamp is
when the whole frame exits the jitter buffer.
This metric increases upon samples or frames exiting, having completed their time in the buffer (and
incrementing jitterBufferEmittedCount). The average jitter buffer delay can be calculated by
dividing the jitterBufferDelay by the jitterBufferEmittedCount.
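As a sketch, the same division can be applied to the differences between two snapshots to get the average delay over a recent interval instead of over the whole stream. Using deltas this way is an assumption of this example, as are the JitterBufferStats type and the averageJitterBufferDelay helper name.

```typescript
// The two cumulative members involved in the average.
interface JitterBufferStats {
  jitterBufferDelay: number; // seconds
  jitterBufferEmittedCount: number;
}

// Average time spent in the jitter buffer, in seconds, for the samples or
// frames emitted between two snapshots. Returns undefined when nothing was
// emitted in the interval.
function averageJitterBufferDelay(
  earlier: JitterBufferStats,
  later: JitterBufferStats
): number | undefined {
  const emitted =
    later.jitterBufferEmittedCount - earlier.jitterBufferEmittedCount;
  if (emitted <= 0) {
    return undefined;
  }
  return (later.jitterBufferDelay - earlier.jitterBufferDelay) / emitted;
}
```

Passing a snapshot of all zeros as the earlier argument gives the whole-stream average described in the text.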
-
jitterBufferTargetDelay
of type double
-
This value is increased by the target jitter buffer delay every time a
sample is emitted by the jitter buffer. The added target is the target
delay, in seconds, at the time that the sample was emitted from the
jitter buffer. To get the average target delay, divide by jitterBufferEmittedCount.
-
jitterBufferEmittedCount
of type unsigned long long
-
The total number of audio samples or video frames that have come out of the
jitter buffer (increasing jitterBufferDelay).
-
totalSamplesReceived
of type unsigned long long
-
Only exists for audio. The total number of samples that have been received on this
RTP stream. This includes concealedSamples.
-
concealedSamples
of type unsigned long long
-
Only exists for audio. The total number of samples that are concealed samples. A
concealed sample is a sample that was replaced with synthesized samples generated
locally before being played out. Examples of samples that have to be concealed
are samples from lost packets (reported in packetsLost) or samples from packets that arrive
too late to be played out (reported in packetsDiscarded).
-
silentConcealedSamples
of type unsigned long long
-
Only exists for audio. The total number of concealed samples inserted that are
"silent". Playing out silent samples results in silence or comfort noise. This is
a subset of concealedSamples.
-
concealmentEvents
of type unsigned long long
-
Only exists for audio. The number of concealment events. This counter increases every
time a concealed sample is synthesized after a non-concealed sample. That is, multiple
consecutive concealed samples will increase the concealedSamples count multiple
times but count as a single concealment event.
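The relationship between the two counters can be illustrated with a small simulation over a sequence of per-sample flags. This is a sketch and not how an implementation is required to count; the countConcealment helper and its input format are hypothetical, and it assumes the stream starts in a non-concealed state.

```typescript
// concealed[i] is true when sample i was synthesized locally.
function countConcealment(concealed: boolean[]): {
  concealedSamples: number;
  concealmentEvents: number;
} {
  let concealedSamples = 0;
  let concealmentEvents = 0;
  // Assumption: before the first sample, the state is "not concealed".
  let previousConcealed = false;
  for (let i = 0; i < concealed.length; i++) {
    const isConcealed = concealed[i] === true;
    if (isConcealed) {
      concealedSamples++;
      // A new event starts only when the previous sample was not concealed.
      if (!previousConcealed) {
        concealmentEvents++;
      }
    }
    previousConcealed = isConcealed;
  }
  return { concealedSamples, concealmentEvents };
}
```

For the flags false, true, true, true, false, true, false, this counts four concealed samples in two concealment events.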
-
insertedSamplesForDeceleration
of type unsigned long long
-
Only exists for audio. When playout is slowed down, this counter is increased by the
difference between the number of samples received and the number of samples played out.
If playout is slowed down by inserting samples, this will be the number of inserted
samples.
-
removedSamplesForAcceleration
of type unsigned long long
-
Only exists for audio. When playout is sped up, this counter is increased by the
difference between the number of samples received and the number of samples played
out. If speedup is achieved by removing samples, this will be the count of samples
removed.
-
audioLevel
of type double
-
Only exists for audio. Represents the audio level of the receiving track. For audio
levels of tracks attached locally, see RTCAudioSourceStats instead.
The value is between 0..1 (linear), where 1.0 represents 0 dBov, 0 represents
silence, and 0.5 represents approximately 6 dBSPL change in the sound pressure
level from 0 dBov.
The audioLevel is averaged over some small interval, using the algorithm
described under totalAudioEnergy. The interval used is implementation dependent.
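The linear scale above can be related to decibels with the usual amplitude conversion, 20 * log10(level). The sketch below assumes that conversion applies; the audioLevelToDbov helper is an illustrative name, not part of this specification.

```typescript
// Converts a linear audioLevel in 0..1 to dBov, where 1.0 maps to 0 dBov.
// A level of 0 (silence) yields -Infinity.
function audioLevelToDbov(level: number): number {
  // log10(x) computed as ln(x) / ln(10).
  return (20 * Math.log(level)) / Math.LN10;
}
```

Under this conversion a level of 0.5 comes out at roughly -6 dBov, which matches the approximately 6 dB change described above.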
-
totalAudioEnergy
of type double
-
Only exists for audio. Represents the audio energy of the receiving track. For
audio energy of tracks attached locally, see RTCAudioSourceStats instead.
This value MUST be computed as follows: for each audio sample that is received
(and thus counted by totalSamplesReceived), add the sample's
value divided by the highest-intensity encodable value, squared and then
multiplied by the duration of the sample in seconds. In other words,
duration * Math.pow(energy/maxEnergy, 2).
This can be used to obtain a root mean square (RMS) value that uses the same
units as audioLevel, as defined in [RFC6464]. It can be
converted to these units using the formula
Math.sqrt(totalAudioEnergy/totalSamplesDuration). This calculation
can also be performed using the differences between the values of two different
getStats() calls, in order to compute the average audio level over
any desired time interval. In other words, do
Math.sqrt((energy2 - energy1)/(duration2 - duration1)).
For example, if a 10ms packet of audio is produced with an RMS of 0.5 (out of
1.0), this should add 0.5 * 0.5 * 0.01 = 0.0025 to totalAudioEnergy.
If another 10ms packet with an RMS of 0.1 is received, this should similarly
add 0.0001 to totalAudioEnergy. Then,
Math.sqrt(totalAudioEnergy/totalSamplesDuration) becomes
Math.sqrt(0.0026/0.02) = 0.36, which is the same value that would be
obtained by doing an RMS calculation over the contiguous 20ms segment of audio.
If multiple audio channels are used, the audio energy of a sample refers to the
highest energy of any channel.
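The two-snapshot calculation described above can be sketched as follows. The AudioEnergyStats type and the averageAudioLevel helper are illustrative names, and returning undefined for an empty interval is an assumption of this sketch.

```typescript
// The two cumulative members used by the RMS calculation.
interface AudioEnergyStats {
  totalAudioEnergy: number;
  totalSamplesDuration: number; // seconds
}

// Math.sqrt((energy2 - energy1) / (duration2 - duration1)), in the same
// units as audioLevel. Returns undefined when no audio duration elapsed
// between the two snapshots.
function averageAudioLevel(
  earlier: AudioEnergyStats,
  later: AudioEnergyStats
): number | undefined {
  const duration = later.totalSamplesDuration - earlier.totalSamplesDuration;
  if (duration <= 0) {
    return undefined;
  }
  const energy = later.totalAudioEnergy - earlier.totalAudioEnergy;
  return Math.sqrt(energy / duration);
}
```

With an earlier snapshot of zeros and a later snapshot holding the 0.0026 energy and 0.02 seconds from the example above, this reproduces the value of about 0.36.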
-
totalSamplesDuration
of type double
-
Only exists for audio. Represents the audio duration of the receiving track. For
audio durations of tracks attached locally, see RTCAudioSourceStats instead.
Represents the total duration in seconds of all samples that have been received
(and thus counted by totalSamplesReceived). Can be used with
totalAudioEnergy to compute an average audio level over different intervals.
-
framesReceived
of type unsigned long
-
Only exists for video. Represents the total number of complete frames received on
this RTP stream. This metric is incremented when the complete frame is received.
-
decoderImplementation
of type DOMString
-
Identifies the decoder implementation used. This is useful for diagnosing
interoperability issues.
If too much information is given here, it increases the fingerprint surface.
Since it is only given for active tracks, the incremental exposure is small.