TTML/changeProposal014

From W3C Wiki

Audio Rendering - OPEN

  • Owner: Nigel Megitt.
  • Started: 14/06/13

Issues Addressed

ISSUE-10 - OPEN

Summary

TTML is currently predicated on the visual presentation of timed text. It may also be useful to present the timed text as audio, e.g. for accessibility purposes. That audio presentation may be pre-recorded or generated by text-to-speech technology, in both cases using the document instance as a source. ISSUE-10 concerns links to external audio resources. Additionally, if TTML is to be used as a source for generating audio tracks for "audio description" (European term) or "video description" (equivalent US term), then extra markup may be useful to guide the conversion to audio, regardless of whether the converter is a human or a machine. For example, pronunciation guidance may be needed.

For both of the above use cases, and for more general metadata capture for subtitles and captions, markup of emotion would also be a useful addition to TTML.

Pronunciation markup may be referenced as an external PLS document.

Emotions may be expressed using EmotionML; however, further work is needed to define how to extend TTML with EmotionML semantics.

Speech synthesis markup is probably not needed here - if it were needed, SSML would appear to be relevant.

Edits to be applied

Strawman

Pronunciation lexicon

Add a lexicon element to tt:head using the same semantics as in SSML §3.1.5.1
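
As an illustration only, the sketch below shows what a document instance carrying such an element might look like, and how a processor could collect the lexicon references before rendering audio. The attribute names (uri, xml:id, type) follow the SSML lexicon element; placing the element unprefixed in the TTML namespace, the example URI, and the findLexicons helper are all assumptions, not part of this proposal. The regular expression is a shortcut to keep the sketch self-contained; a real processor would use an XML parser.

```javascript
// Hypothetical TTML instance. The lexicon element in tt:head is the strawman
// from this proposal; its namespace and exact placement are assumed here.
const ttmlSource = `<tt xmlns="http://www.w3.org/ns/ttml" xml:lang="en">
  <head>
    <lexicon uri="http://www.example.com/lexicon.pls" xml:id="names" type="application/pls+xml"/>
  </head>
  <body/>
</tt>`;

// Illustrative helper (not an existing API): list the lexicon references
// found in a document so that they can be fetched before audio rendering.
function findLexicons(source) {
  const lexicons = [];
  const elementPattern = /<lexicon\b([^>]*)\/?>/g;
  const attributePattern = /([\w:.-]+)\s*=\s*"([^"]*)"/g;
  for (const element of source.matchAll(elementPattern)) {
    const attributes = {};
    for (const [, name, value] of element[1].matchAll(attributePattern)) {
      attributes[name] = value;
    }
    lexicons.push({
      id: attributes["xml:id"],
      uri: attributes.uri,
      type: attributes.type,
    });
  }
  return lexicons;
}

console.log(findLexicons(ttmlSource));
```

The referenced PLS document would then supply pronunciations for tokens in the timed text, whether the converter is a speech synthesiser or a human describer's tooling.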

Emotion markup

TODO
Find a way to integrate EmotionML semantics into TTML

Map external audio cue fetch and playback to JavaScript

Use the enter event associated with cues, together with a proposed getCueAsAudio() method, thus:
    cue.addEventListener("enter",
          function (event) {
              var sender = event.target;  // the cue that became active
              var theVideo = document.getElementsByName("theVideo")[0];
              var savedVolume = theVideo.volume;
              theVideo.volume = 0;  // silence the video's own audio during the description
              // set watchdog on video in case it overruns the description duration
              var watchdog = function () {
                  if (theVideo.currentTime > cue.endTime) theVideo.pause();
              };
              theVideo.addEventListener("timeupdate", watchdog);
              // getCueAsAudio() is the proposed method; it is not an existing API
              var myAudio = sender.getCueAsAudio();  // if this is too slow do outside handler
              myAudio.addEventListener("ended",
                    function () {
                        theVideo.removeEventListener("timeupdate", watchdog);
                        theVideo.volume = savedVolume;
                        if (theVideo.paused) theVideo.play();
                    });
              myAudio.play();
          });
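
The watchdog in the handler above implies a simple timing rule: if the description audio is longer than the cue, the video is held at the cue end until the audio finishes. The sketch below works that rule through, assuming the audio starts exactly at the cue start and plays at normal speed. The planDescription helper and its return shape are hypothetical, and in practice the pause would land slightly after cue.endTime because timeupdate events fire only periodically.

```javascript
// Hypothetical helper: given a cue and the duration of its description audio
// (both in seconds), work out whether and for how long the video is held.
function planDescription(cue, audioDuration) {
  const cueDuration = cue.endTime - cue.startTime;
  // The audio overruns the cue by this much; zero if it fits inside the cue.
  const holdDuration = Math.max(0, audioDuration - cueDuration);
  return {
    // Media time at which the watchdog pauses the video, or null if never.
    pauseAt: holdDuration > 0 ? cue.endTime : null,
    holdDuration: holdDuration,
  };
}

// A 4 s cue with 6.5 s of description audio: the video waits at 14 s.
console.log(planDescription({ startTime: 10, endTime: 14 }, 6.5));
// A 4 s cue with 3 s of description audio: the video is never paused.
console.log(planDescription({ startTime: 10, endTime: 14 }, 3));
```

Holding the video in this way extends the overall presentation time, which may matter for authoring tools that need to report the described programme's duration.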

Edits applied

Impact

References

Pronunciation Lexicon Specification
PLS.
Emotion Markup Language
EmotionML.
Speech Synthesis Markup Language
SSML.