20327 – Continuous splice flag

This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Summary: Continuous splice flag

Status:	RESOLVED FIXED

Alias:	None

Product:	HTML WG
Classification:	Unclassified
Component:	Media Source Extensions
Version:	unspecified
Hardware:	All Windows 3.1

Importance:	P2 normal
Target Milestone:	---
Assignee:	Aaron Colwell (c)
QA Contact:	HTML WG Bugzilla archive list

URL:
Whiteboard:
Keywords:

Depends on:	19673
Blocks:

Reported:	2012-12-10 17:58 UTC by Pierre Lemieux
Modified:	2013-02-19 00:47 UTC
CC List:	4 users

See Also:

Attachments

Description Pierre Lemieux 2012-12-10 17:58:36 UTC

Implementation behavior at a splice point between two Media Segments may depend on whether the contents of the two Media Segments are identical around the splice points, i.e. whether they were intended to be spliced. For instance, if the two Media Segments are identical around a splice point, then the implementation can avoid applying additional processing such as cross-fades.

A continuousSplice boolean attribute could be added to the SourceBuffer interface.

This seems like an implementation detail that wouldn't be visible from the user's point of view. Worst case scenario, a cross-fade between identical data will occur, which wouldn't be perceptible. This would only happen for a single codec frame or two, which doesn't seem like it would be too taxing on UA resources.

Unless I'm missing something, I don't think this is worth modifying the SourceBuffer interface for.

Comment 2 Pierre Lemieux 2012-12-10 19:52:28 UTC

Right. This is not about avoiding taxing UA resources, but rather giving a chance to the UA to avoid audible artifacts at splice points unless necessary -- the UA can always ignore the flag.

When processing audio splices in the baseband domain (whether from native baseband signals or obtained from coded signals), the UA will need to introduce a fade of some sort (which can be audible if there is energy) unless it is told that the audio signal is identical on each side of the splice.

The purpose of the continuousSplice flag would be to signal to the UA that the audio signal is identical on each side of the splice. The UA could try to determine on its own whether the signals are identical, but then we would need to define what "identical" means.

(In reply to comment #2)
> Right. This is not about avoiding taxing UA resources, but rather giving a
> chance to the UA to avoid audible artifacts at splice points unless
> necessary -- the UA can always ignore the flag.
> 
> When processing audio splices in the baseband domain (whether from native
> baseband signals or obtained from coded signals), the UA will need to
> introduce a fade of some sort (which can be audible if there is energy)
> unless it is told that the audio signal is identical on each side of the
> splice.
> 
> The purpose of the continuousSplice flag would be to signal to the UA that
> the audio signal is identical on each side of the splice. The UA could try
> to determine on its own whether the signals are identical, but then we would
> need to define what "identical" means.

I don't understand. If the audio is identical then how would the fade be audible? I'm assuming the fade would be something like out[i] = a * frame1[i] + (1-a) * frame2[i]. If these two frames contain the same data then I don't think this would be audible. If there is any sort of level shift, difference in quantization or something similar, I would think that you'd want the fade there to make the transition less jarring.
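The fade assumed in this comment can be sketched in a few lines of Python. This is only an illustration of the formula out[i] = a * frame1[i] + (1-a) * frame2[i]; the function name, the linear ramp for a, and the 48 kHz test tone are assumptions made for the example and are not taken from any UA implementation.

```python
import math

def linear_crossfade(frame1, frame2):
    """Crossfade two equal-length frames with weights a and (1 - a)."""
    if len(frame1) != len(frame2):
        raise ValueError("frames must have the same length")
    n = len(frame1)
    out = []
    for i in range(n):
        # a ramps linearly from 1.0 (all frame1) down to 0.0 (all frame2).
        a = 1.0 - i / (n - 1) if n > 1 else 1.0
        out.append(a * frame1[i] + (1.0 - a) * frame2[i])
    return out

# Assumed test signal: 5 ms of a 440 Hz sine at 48 kHz (240 samples).
tone = [math.sin(2 * math.pi * 440 * i / 48000) for i in range(240)]

# Identical content on both sides of the splice.
faded = linear_crossfade(tone, tone)
max_error = max(abs(x - y) for x, y in zip(tone, faded))
print("largest deviation from the original:", max_error)
```

Because the two weights always sum to 1, identical inputs should come back unchanged apart from floating-point rounding, which is the point being made here.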

Can you please provide a concrete use case where this would be used and how the current spec would lead to an unacceptable experience.

Comment 4 Pierre Lemieux 2012-12-10 22:51:28 UTC

> I'm assuming the fade would be something like
> out[i] = a * frame1[i] + (1-a) * frame2[i].

Right. Mandating (or strongly recommending) linear crossfades would resolve this issue.

The idea was to give implementations the option of using other types of cross-fades, e.g. constant power or Kaiser-Bessel, which may be more desirable.

Some others may need to weigh in on this, but this still doesn't seem very compelling to me. I feel like only an extremely tiny fraction of users would actually even use this and it isn't clear to me how perceptible the difference would be since the content is supposed to be identical. It seems odd that crossfading algorithms would introduce significant artifacts when the two inputs are identical. 

I feel like we should wait until there is more implementation experience before we add something like this.

Comment 6 Pierre Lemieux 2012-12-11 05:09:52 UTC

> I feel like we should wait until there is more implementation
> experience before we add something like this.

Ok with me.

> it isn't clear to me how perceptible the difference would be since the content is supposed to be identical.

I have uploaded at the link below a WAV file containing a single 440 Hz sine wave with a single 5 ms equal-power crossfade generated using Pro Tools -- simulating a splice with identical content on each side.

https://docs.google.com/open?id=0Bz7s0dhnv-7HZHhSdXg1dWxLOE0

An equal-gain crossfade would be inaudible as the original signal would be recovered exactly.

> It seems odd that crossfading algorithms would introduce
> significant artifacts when the two inputs are identical. 

The optimal cross-fade algorithm depends on the nature of the content on each side of the splice.

s_splice = a_left s_left + a_right s_right

For identical content, one should use equal-gain crossfade (a_right = 1 - a_left), as you noted earlier, so that the exact signal is recovered.

s_splice = a_left s_left + (1 - a_left) s_left = s_left

For uncorrelated content (<s_left s_right> = 0), equal-power crossfade (a_left^2 + a_right^2 = 1) can be used so that, assuming that s_left and s_right have equal power, power remains constant across the splice.

<s_splice^2> = a_left^2 <s_left^2> + 2 a_left a_right <s_left s_right> + a_right^2 <s_right^2> = (a_left^2 + a_right^2) <s_left^2> = <s_left^2>

> I feel like only an extremely tiny fraction of users
> would actually even use this

Well, every DAW out there typically offers both equal-power and equal-gain crossfades.
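The two weightings discussed in this comment can be compared numerically. The sketch below evaluates the amplitude gain a_left + a_right seen by identical content and the power gain a_left^2 + a_right^2 seen by uncorrelated content of equal power. The sine/cosine curve is one common way to satisfy a_left^2 + a_right^2 = 1 and is an assumption here; the curve Pro Tools uses is not stated in this thread. All function names are hypothetical.

```python
import math

def equal_gain_weights(t):
    """Linear weights at fade position t in [0, 1]: a_left + a_right = 1."""
    return 1.0 - t, t

def equal_power_weights(t):
    """Sine/cosine weights (assumed curve): a_left^2 + a_right^2 = 1."""
    return math.cos(t * math.pi / 2), math.sin(t * math.pi / 2)

def splice_gain_identical(weights, t):
    """Amplitude gain when s_left == s_right, i.e. a_left + a_right."""
    a_left, a_right = weights(t)
    return a_left + a_right

def splice_power_uncorrelated(weights, t):
    """Power gain when <s_left s_right> = 0 and both sides have equal power."""
    a_left, a_right = weights(t)
    return a_left ** 2 + a_right ** 2

for t in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(
        f"t={t:.2f}"
        f"  identical: equal-gain {splice_gain_identical(equal_gain_weights, t):.3f},"
        f" equal-power {splice_gain_identical(equal_power_weights, t):.3f}"
        f"  uncorrelated power: equal-gain {splice_power_uncorrelated(equal_gain_weights, t):.3f},"
        f" equal-power {splice_power_uncorrelated(equal_power_weights, t):.3f}"
    )
```

With these curves, identical content passes through an equal-gain fade at unity gain but is boosted by a factor of sqrt(2) (about 3 dB) at the midpoint of an equal-power fade, while uncorrelated content keeps constant power under equal-power and dips to half power at the midpoint under equal-gain.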

A boolean flag doesn't appear to convey what you really want here. It seems that something like this would be better.

enum AudioCrossfadeType {
    "none",
    "linear",
    "equal-power"
};

attribute AudioCrossfadeType audioCrossfadeType;

Every time an append causes a splice, the value of this attribute is stored with the splice point. This would allow different values to be used at different splice points. Implementations may throw an exception if an unsupported crossfade type is specified.

I'm still not totally sold yet, but I think this proposal is better at conveying what you are trying to accomplish.
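The behaviour described for the proposed attribute can be modelled roughly as follows. This is a toy model written in Python, not the Media Source Extensions API: the SpliceLog class, its method names, the "linear" default, and the choice of exception are all assumptions made for illustration. Only the three enum values come from the proposal above.

```python
from enum import Enum

class AudioCrossfadeType(Enum):
    NONE = "none"
    LINEAR = "linear"
    EQUAL_POWER = "equal-power"

class SpliceLog:
    """Hypothetical stand-in for a SourceBuffer that remembers splice points."""

    def __init__(self, supported):
        self.supported = set(supported)
        # Assumed default; the proposal does not name one.
        self.audio_crossfade_type = AudioCrossfadeType.LINEAR
        self.splices = []

    def set_crossfade_type(self, value):
        fade = AudioCrossfadeType(value)  # ValueError for unknown strings
        if fade not in self.supported:
            # Models "may throw an exception if an unsupported type is specified".
            raise NotImplementedError(f"unsupported crossfade type: {fade.value}")
        self.audio_crossfade_type = fade

    def record_splice(self, timestamp):
        # The attribute's current value is stored with the splice point.
        self.splices.append((timestamp, self.audio_crossfade_type))

log = SpliceLog({AudioCrossfadeType.NONE, AudioCrossfadeType.LINEAR})
log.record_splice(10.0)          # uses the assumed default, "linear"
log.set_crossfade_type("none")
log.record_splice(20.0)          # this splice is stored with "none"
```

Storing the value at splice time rather than reading it at playback time is what lets different splice points use different fades.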

Changes committed.
https://dvcs.w3.org/hg/html-media/rev/d5956e93b991

I've mandated a linear crossfade for now. I'd like to defer adding other types of crossfades until v2. Content providers can implement other types of fades by simply doing the splice earlier with identical content and then encoding the desired fade in the content itself.