Re: [css3-speech] Volume properties from Daniel Weck on 2011-10-19 (www-style@w3.org from October 2011)

From: Daniel Weck <daniel.weck@gmail.com>
Date: Thu, 20 Oct 2011 00:02:27 +0100
To: Christoph Päper <christoph.paeper@crissov.de>
Cc: W3C style mailing list <www-style@w3.org>
Message-Id: <263C8F9D-69DF-48B7-ACDC-E76C782E47EC@gmail.com>

Hello Christoph,
just a heads-up to let you know that the CSS Working Group has reviewed this issue, and the consensus is to keep the "voice-volume" property name as it is now. As per the W3C process, you may choose to accept this resolution, or you may decide to raise an objection. Please let us know, and thanks again for your other (useful) comments!

For your information, there is a public wiki page for the Disposition Of Comments pertaining to the Last Call Working Draft of the CSS Speech Module, Level 3:

http://wiki.csswg.org/spec/css3-speech?&#issue-3

I would also like to point out that there is an opportunity for several groups to collaborate and address the challenges of a high-level, unified sound mixing architecture for web content. You might be interested in giving your input / feedback. I envision these groups being involved in the discussion: CSS-Speech, the Audio Working Group, the HTML-Speech Incubator Group, and HTML5 (audio/video element, etc.). Here is a related email thread:

http://lists.w3.org/Archives/Public/www-style/2011Oct/0566.html

Kind regards, Dan

On 16 Oct 2011, at 22:03, Daniel Weck wrote:

> Hello Christoph, could you please confirm whether or not you are okay with keeping the name of the 'voice-volume' property as it is now? Many thanks! Regards, Daniel
> 
> On 8 Sep 2011, at 17:45, Daniel Weck wrote:
> 
>> Hello Christoph,
>> thank you for your ideas about the next generation of CSS Speech (Level 4+). That's certainly good food for thought, useful for when 3D spatial audio is discussed in Level 4.
>> 
>> With regard to your suggestion to rename 'voice-volume' to 'volume-voice': I think you are making the premature assumption that a shorthand property will naturally cover the needs of both Text-To-Speech (which is the scope of the current CSS Speech specification) and more generic multimedia control (e.g. CSS properties to control embedded document audio/video, either authored in markup or injected via the "content" property). I think that the dichotomy between these two domains will remain. The 'cue' feature (pre-recorded audio clips) in CSS Speech is very specific to how a screen-reader-like aural experience is structured (not limited to accessibility needs, as you know), which is why it is encapsulated within the "aural box model" right now. You mention future consistency as an argument, but I actually think that consistency in the current Level 3 context is important, so I am not in favor of renaming 'voice-volume'.
>> 
>> Let us know if this is a satisfactory answer.
>> Regards, Daniel
>> 
>> On 19 Aug 2011, at 14:07, Christoph Päper wrote:
>> 
>>> Bert Bos:
>>> 
>>>> http://www.w3.org/TR/2011/WD-css3-speech-20110818/
>>> 
>>> The introductory section 4 really helps CSS-savvy people to quickly grasp essential concepts, thanks.
>>> 
>>> Of course, only one dimension of the 2D visual box model – it’s 2½D if you count ‘z-index’ and table layout – can map to the 1D aural box model. The editors chose the vertical domain (‘top’ / [‘middle’] / ‘bottom’) for the diagram over the horizontal (‘right’ / [‘center’] / ‘left’) and the logical one (‘start’ / ‘end’), although they use the second set of logical terms (‘before’ / ‘after’), which makes sense, because we only have ‘::before’ and ‘::after’ for runs of text, which are also 1D (although their constituents are 2D in visual media).
>>> 
>>> Hm, that became more of a note to myself. Originally, I wanted to suggest using ‘margin-before’ etc. instead, if that was agreed upon. A future Box module should cover all models and dimensions:
>>> * 1D temporal                – t
>>> * 1D linear                  – x
>>> * 1½D temporal layers        – tv
>>> * 1½D linear layers          – xv
>>> * 2D planar                  – xy
>>> * 2D marquee                 – tx
>>> * 2½D planar layers          – xyv
>>> * 2½D marquee layers         – txv
>>> * 3D spatial                 – xyz
>>> * 3½D temporal-planar layers – txyv
>>> * 4D temporal-spatial        – txyz
>>> 
>>> The Speech module currently is “t”, but should probably become “tv” in the future, see below.
>>> 
>>>> The module contains the properties to style how a document is spoken by 
>>>> a speech synthesizer: voice, volume, speed, pauses, cue sounds, etc.
>>> 
>>> I believe in level 4 this module should be split into two, separating aural and speech (synthesis) properties, because one might want to do, following Example XIII,
>>> 
>>> @media aural { article#poem {
>>>  content: url(Poem_performance.audio);
>>>  audio-volume: soft;
>>> }}
>>> @media speech { article#poem {
>>>  background: url(ambience.audio);
>>>  volume: loud /* speech */ x-soft /* background */;
>>> }}
>>> 
>>> or
>>> 
>>> …
>>>  voice-volume: loud;
>>>  voice-opacity: 80% /* or 0.8 */;
>>> }}
>>> 
>>> Therefore I think ‘volume-voice’ makes more sense than ‘voice-volume’, because then we can more consistently introduce a shorthand property ‘volume’ later.
>>> 
>>> PS: You could extend that example for other media
>>> 
>>> @media print {article#poem {
>>>  content: url(Poem_calligraphy.image);
>>> }}
>>> @media projection {article#poem {
>>>  content: url(Poem_illustrated.animation);
>>> }}
>>> @media tv {article#poem {
>>>  content: url(Poem_illustrated_performance.video);
>>> }}
>>> @media screen {article#poem {
>>>  content: url(Poem_experience.app);
>>> }}
>>> 
Received on Wednesday, 19 October 2011 23:03:10 UTC