ACTION-2: Categorize and assess use cases on wiki

State:
closed
Person:
Chris Grigg
Due on:
July 19, 2010
Created on:
July 12, 2010
Related emails:
No related emails

Related notes:

Use Cases outlined can be found on the AudioXG Wiki:
http://www.w3.org/2005/Incubator/audio/wiki/Use_Cases

Details have been duplicated for the record:
________________________________________________________________________________

Core Use Cases - v1.1

Chris Grigg has put together "Core Use Cases" (originally v1.0, now at v1.1), which consists of various use case classes and their basic descriptions. This document is expected to evolve somewhat; please feel free to offer comments and suggestions on the mailing list.


Core Use Case Classes

Usability/Accessibility Speech
UI Sounds
Basic Media Functionality
Interactive Audio Functionality
Audio Production Basics
Audio Effects II: Mastering
Audio Effects III: Spatial Simulation
Digital Sheet Music

Definitions used below:
"sound" means recorded audio, synthetic audio, or synthetic music content (sequence + instruments)
"spoken" mean speech content either as recorded audio or synthetic speech

Class 1. Usability/Accessibility Speech

These use cases address basic usability of web pages generally. I mention them since the XG Charter includes speech synthesis; however, they may already be addressed in part by other W3C specifications (for example CSS3 Speech, SSML, the Voice Browser WG, the Web Content Accessibility Guidelines, etc.).

Upon user click, mouseover, etc. (or as a preferences behavior):
  Trigger spoken version of a web page's (or textual element's) text contents (for visually impaired users)
  Trigger spoken help for a web page (or entry form)
On error:
  Trigger spoken error message
Support for multiple alternate versions in different natural languages should be considered.
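
To make the trigger-on-click case concrete, here is a minimal sketch in TypeScript using the browser SpeechSynthesis interface. That interface postdates this document and is shown purely as illustration; the data-speak attribute is a hypothetical convention, not anything specified here.

  // Speak an element's text content when the user clicks it.
  function speakElementText(el: HTMLElement, lang: string = "en-US"): void {
    const utterance = new SpeechSynthesisUtterance(el.textContent ?? "");
    utterance.lang = lang;            // alternate natural languages, per Class 1
    window.speechSynthesis.cancel();  // stop any speech already in progress
    window.speechSynthesis.speak(utterance);
  }

  // Hypothetical convention: elements opting in via a data-speak attribute.
  document.querySelectorAll<HTMLElement>("[data-speak]").forEach((el) => {
    el.addEventListener("click", () => speakElementText(el, el.lang || "en-US"));
  });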


Class 2. UI Sounds

These use cases bring to web apps the basic UI aural feedback (AKA 'sonification') typical of native apps and games. They may already be addressed in part by the HTML5 event handling model.

Trigger one or more sounds when:
  User clicks (hovers, etc.) any given visual element within a web page
  User presses Tab key to move to the next visual element within a web page
  User presses a key while in a text entry field
  A visual element within a web page changes its own state (open, resize, move, transition, close, etc.)
  A window changes state (open, resize, move, transition, close, etc.)
  (Etc.)
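
As an illustration of wiring one-shot feedback sounds to DOM events, the following sketch uses the Web Audio API that later grew out of this incubation work; the file name click.wav and the button selector are hypothetical.

  const ctx = new AudioContext();

  // Fetch and decode a short UI sound into an in-memory buffer.
  async function loadSound(url: string): Promise<AudioBuffer> {
    const response = await fetch(url);
    return ctx.decodeAudioData(await response.arrayBuffer());
  }

  // One-shot playback: a fresh source node per trigger, as the API requires.
  function playOneShot(buffer: AudioBuffer): void {
    const source = ctx.createBufferSource();
    source.buffer = buffer;
    source.connect(ctx.destination);
    source.start();
  }

  loadSound("click.wav").then((clickBuffer) => {
    document.querySelectorAll("button").forEach((btn) =>
      btn.addEventListener("click", () => playOneShot(clickBuffer)));
  });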

Class 3. Basic Media Functionality

These use cases bring simple audio-for-visual-media behaviors to web applications. They may already be addressed in part by the HTML5 event handling model.

Automatically trigger one or more sounds:
  In synch with an animated visual element (animated GIF, SVG, Timed text, etc.)
  As continuous background soundtrack when opening a web page, window, or site
Connect user events in visual elements (click etc.) to:
  sound element transport controls (play/pause/stop/rewind/locate-to/etc.)
  music synthesis events (note-on/note-off/control-change/program-change/bank-load/etc.)
Upon user click, mouseover, etc. (or as a preferences behavior):
  Trigger speech containing additional informational content not present in page text (or visual element)
  (consider multiple alternate versions in different natural languages)
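
A minimal sketch of connecting click events to the transport controls of an HTML5 <audio> element (play/pause/locate-to); the element ids are hypothetical.

  const track = document.getElementById("soundtrack") as HTMLAudioElement;

  document.getElementById("play")!.addEventListener("click", () => track.play());
  document.getElementById("pause")!.addEventListener("click", () => track.pause());
  document.getElementById("rewind")!.addEventListener("click", () => {
    track.currentTime = 0;  // locate-to the start of the media
  });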

Class 4. Interactive Audio Functionality

These use cases support common user expectations of game audio, but can also improve the user experience for traditional web pages, sites, and apps. Interactive audio can be defined as (i) sound that changes based upon the current game/app state, and (ii) sound with enough built-in variation to reduce listener fatigue that would otherwise occur over the long timespans typical of many games.

Branching sounds (one-shot -- selection among multiple alternate versions)
Branching sounds (continuous -- selection among multiple alternate next segments)
Parametric controls (mapped to audio control parameters like gain, pan/position, pitch, processing, etc.)
Note: This functionality may require either defining new media type(s), or perhaps a change to the <audio> element semantics. In interactive audio, a sound is not the same as a single playable media file; typically a sound (or 'cue') is some kind of bag of pointers to multiple playable audio files, plus some selection logic and/or parameter mapping logic.
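
The note above can be illustrated with a sketch of such a 'cue' object: a bag of alternate buffers plus selection logic and one mapped parameter (gain). The class and its names are hypothetical; the audio calls are from the Web Audio API, which postdates this document.

  // A branching one-shot cue: several alternate versions of one sound,
  // plus a parametric gain control.
  class Cue {
    private gainNode: GainNode;

    constructor(private ctx: AudioContext, private variants: AudioBuffer[]) {
      this.gainNode = ctx.createGain();
      this.gainNode.connect(ctx.destination);
    }

    // Selection logic: pick one of the alternate versions at random.
    trigger(): void {
      const pick = this.variants[Math.floor(Math.random() * this.variants.length)];
      const source = this.ctx.createBufferSource();
      source.buffer = pick;
      source.connect(this.gainNode);
      source.start();
    }

    // Parameter mapping: drive gain from the current game/app state.
    setGain(value: number): void {
      this.gainNode.gain.value = value;
    }
  }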


Class 5. Audio Production Basics

For sounds that are generated (or in some cases combined) in real time, these use cases support common listener expectations of well-produced music and sound.

Mixing:
  By default, multiple sources + effects combine to one output
  By default, audio sources' effects sends combine to designated effects
  <video> elements with audio output are treated as audio sources
  (maybe submixes, but this is more advanced == less important)
Audio Effects I:
  Reverb (studio production types, not physical space simulations)
  Equalization
  (maybe Delays, Chorus, etc. but this is more advanced == less important)
These effects may be usefully applied on a per-source basis, on an effects send basis, on a submix output, or on the master mix output.
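
The default routing described above (sources plus effects sends combining to one output) might look like this in Web Audio API terms; irBuffer, the reverb impulse response, is assumed to be loaded elsewhere.

  declare const irBuffer: AudioBuffer;  // assumed pre-loaded impulse response

  const ctx = new AudioContext();
  const master = ctx.createGain();
  master.connect(ctx.destination);      // one master output by default

  const reverb = ctx.createConvolver(); // a designated shared effect
  reverb.buffer = irBuffer;
  reverb.connect(master);               // effect output joins the master mix

  function addSource(buffer: AudioBuffer, sendLevel: number): void {
    const source = ctx.createBufferSource();
    source.buffer = buffer;
    source.connect(master);             // dry path to the master mix

    const send = ctx.createGain();      // per-source effects send
    send.gain.value = sendLevel;
    source.connect(send);
    send.connect(reverb);
    source.start();
  }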

Note: In many cases recorded music, sound effects, and speech will (or can be made to) incorporate their own production effects, and therefore will not need API support.

Note: We could stop after Class 5 and still support most game genres.


Class 6. Audio Effects II: Mastering

For sounds that are generated (or in some cases combined) in real time, these use cases support a higher level of listener expectation for well-produced music and sound, and may also increase intelligibility generally.

Dynamics (compression, limiting)
Aural enhancement (exciters, etc.)
Mastering functionality is more advanced == less important than the above classes.
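
For illustration, a compressor on the master output could be sketched as follows with the Web Audio API's DynamicsCompressorNode; the parameter values are arbitrary examples, not recommendations.

  const ctx = new AudioContext();

  const compressor = ctx.createDynamicsCompressor();
  compressor.threshold.value = -24;  // dB above which gain reduction begins
  compressor.ratio.value = 4;        // 4:1 compression above the threshold
  compressor.connect(ctx.destination);

  // Route the whole mix through the compressor rather than the destination:
  const master = ctx.createGain();
  master.connect(compressor);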

Note: In many cases recorded music, sound effects, and speech will (or can be made to) incorporate their own mastering effects, and therefore will not need API support.

Class 7. Audio Effects III: Spatial Simulation

For those users listening in stereo, 3D spatialization causes a sound source (or submix) to appear to come from a particular position in 3D space (direction & distance). This functionality can be useful in some game types where 3D position changes in real time, and in teleconferencing where spatializing each speaker to a different static position can help in discriminating who is speaking. Environmental reverb provides clues as to the size and character of the enclosing space (room/forest/cavern, etc.), supporting a more immersive gaming experience.

3D spatialization
Environment simulation reverb
Spatial simulation is more advanced == less important than the above classes.
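
A sketch of both features using the Web Audio API: a PannerNode places a source in 3D, and a ConvolverNode supplies environmental reverb (caveIR is an assumed, pre-loaded impulse response buffer).

  declare const caveIR: AudioBuffer;

  const ctx = new AudioContext();

  const panner = ctx.createPanner();
  panner.panningModel = "HRTF";   // binaural rendering for stereo listeners
  panner.positionX.value = 5;     // direction & distance in 3D space
  panner.positionY.value = 0;
  panner.positionZ.value = -2;

  const envReverb = ctx.createConvolver();
  envReverb.buffer = caveIR;      // room/forest/cavern character

  panner.connect(envReverb);
  envReverb.connect(ctx.destination);
  // Connect any source node to 'panner' to hear it at that position.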

Class 8. Digital Sheet Music

Audio is one way to represent music in computer format. Another way is through a symbolic representation of music. Symbolic representations directly represent musical concepts relevant to performers, such as pitches, rhythms, and lyrics. Music notation, or sheet music, is used by performers of classical music, film music, and other musical repertoires.

Existing solutions for displaying and playing digital sheet music in the browser (e.g. Sibelius Scorch, MusicNotes, FreeHand Solero, Legato, Noteflight, Myriad Music Plug-in) use either browser-specific plug-ins or Flash. Digital sheet music developers are looking for more browser-standard approaches for displaying and playing sheet music on the widest possible variety of devices. In particular, mass-market tablet devices have the potential to serve as electronic music stands, displacing the use of paper sheet music over time.

The MusicXML format has established itself as the de facto standard format for music notation, supported by over 125 different notation programs. The format is available under a royalty-free license modeled on the W3C license. Display and playback of the MusicXML format would make this feature useful to a wide variety of applications and developers. Extensibility so that individual vendors could add support for proprietary formats may also be desirable.
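
As a small illustration of how browser-standard machinery could read MusicXML, the sketch below extracts pitches with DOMParser. The note/pitch/step/octave elements are genuine MusicXML; the surrounding function is hypothetical.

  // Collect the pitches of all notes in a partwise MusicXML document.
  function extractPitches(xml: string): string[] {
    const doc = new DOMParser().parseFromString(xml, "application/xml");
    const pitches: string[] = [];
    doc.querySelectorAll("note > pitch").forEach((pitch) => {
      const step = pitch.querySelector("step")?.textContent ?? "";
      const octave = pitch.querySelector("octave")?.textContent ?? "";
      pitches.push(step + octave);  // e.g. "C4"
    });
    return pitches;
  }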

Alistair MacDonald, 20 Aug 2010, 23:55:08


