The Voice Browser DFP Framework

Informative note from the Voice Browser Working Group, 8 February 2006

1 DFP Overview

The DFP (Data Flow Presentation) framework, developed by the Voice Browser Activity, explains how Voice Browser specifications can be used together to create modular voice applications.

The framework is composed of three layers: data, flow and presentation.

2 Relationship to other Approaches

The DFP framework is an instance of the Model-View-Controller (MVC) design pattern. The data layer instantiates MVC's model, the flow layer instantiates the controller, and the presentation layer instantiates the view.

The DFP framework is also intended as a voice-centric instance of the Multimodal architecture [MMIARCH] developed by the W3C Multimodal Interaction Activity. The data layers are identical, MMI's runtime framework corresponds to the flow layer, and MMI's modality components correspond to DFP's presentation components. Ongoing collaboration between the activities will further refine and clarify the alignment between these approaches.

3 Interface between Flow and Presentation Layers

The interface between flow and presentation components is defined in terms of invocation requests and their responses, as well as asynchronous notifications. Each of these can be modelled as an event with an event name and a data payload. The payload is modelled as a set of name-value pairs (properties), where the name is a string and the value is either an atomic type (e.g. string, integer or boolean) or a complex type (e.g. a nested properties structure). The precise format of the data payload is not yet decided.
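The event model described above can be sketched as follows. This is a minimal illustration only: the class name Event, the validate helper, the example URI and the property names are all assumptions made for this sketch, since the note leaves the payload format undecided.

```python
from dataclasses import dataclass, field
from typing import Dict, Union

# A payload value is either an atomic type or a nested properties structure.
Atomic = Union[str, int, bool]
Value = Union[Atomic, Dict[str, "Value"]]


@dataclass
class Event:
    """Hypothetical event: an event name plus name-value properties."""
    name: str
    payload: Dict[str, Value] = field(default_factory=dict)


def validate(value) -> bool:
    """Return True if value is atomic or a nested structure with string names."""
    if isinstance(value, (str, int, bool)):
        return True
    if isinstance(value, dict):
        return all(isinstance(k, str) and validate(v) for k, v in value.items())
    return False


# Illustrative 'start' event; the URI and property names are invented.
start = Event("start", {
    "src": "http://example.com/collect_card.vxml",
    "params": {"currency": "USD", "max_attempts": 3, "barge_in": True},
})
```

A nested dictionary is used here simply because it maps directly onto "name-value pairs with atomic or complex values"; any concrete serialization could carry the same structure.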

A flow component can invoke a presentation by sending a 'start' event. The event needs to include sufficient information to start the presentation; for example, it may include a URI referencing a VoiceXML script and may also include information which is passed to this script upon initiation.

Once the presentation is started, a flow component may cancel the presentation by issuing a 'stop' event. Otherwise, the presentation runs until completion and a 'stopped' event is returned from the presentation to the flow component. The 'stopped' event may include data collected from the user during the presentation. Prior to the presentation being stopped, the flow and presentation components may send each other 'update' events.
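The lifecycle implied by the 'start', 'stop', 'stopped' and 'update' events might be sketched as a small state machine. The class below is hypothetical: the state names, the notify callback, and the 'cancelled' and 'data' properties of the 'stopped' payload are assumptions introduced for illustration and are not defined by the framework.

```python
class Presentation:
    """Hypothetical presentation component tracking its lifecycle."""

    def __init__(self, notify):
        self.state = "idle"    # idle -> running -> stopped
        self.notify = notify   # callback delivering events to the flow component
        self.updates = []      # 'update' payloads received from the flow component

    def receive(self, name, payload=None):
        """Handle an event sent by the flow component."""
        if name == "start" and self.state == "idle":
            self.state = "running"
        elif name == "update" and self.state == "running":
            self.updates.append(payload or {})
        elif name == "stop" and self.state == "running":
            self._finish(cancelled=True, data={})
        else:
            raise ValueError(f"'{name}' not allowed in state '{self.state}'")

    def send_update(self, payload):
        """Send an 'update' event to the flow component while running."""
        if self.state != "running":
            raise ValueError("presentation is not running")
        self.notify("update", payload)

    def complete(self, data):
        """Called when the presentation runs to completion on its own."""
        if self.state != "running":
            raise ValueError("presentation is not running")
        self._finish(cancelled=False, data=data)

    def _finish(self, cancelled, data):
        self.state = "stopped"
        self.notify("stopped", {"cancelled": cancelled, "data": data})


# Usage: the flow component records every event it gets back.
received = []
p = Presentation(lambda name, payload: received.append((name, payload)))
p.receive("start", {"src": "collect_card.vxml"})
p.complete({"card_number": "4111"})
```

Rejecting events that arrive in the wrong state is one possible design choice; an implementation could equally ignore them.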

An example of this interface is where a CCXML component is a flow component and VoiceXML 2.1 is a presentation component. At some stage in the application flow, the CCXML script starts a VoiceXML presentation by executing a <dialogstart> element with a src attribute indicating the script to run. Once the presentation has completed, a dialog.exit event is returned to the CCXML component.

More advanced interaction with the presentation is possible in the DFP framework than is currently permitted with VoiceXML 2.0/2.1. Consequently, VoiceXML 3.0 may be enhanced with capabilities such as:

4 Benefits

With the DFP framework, developers are able to structure their application in a modular manner, where data, flow and presentations are expressed in components at the appropriate layer.

An application's flow can be expressed in terms of states in a flow component: for a given state, a presentation component is invoked, and the results returned from the presentation component trigger state transitions in the flow component. This enables a clear separation of flow from presentation within the application, and facilitates the development of reusable presentation components (such as parameterized VoiceXML <form>s for credit-card collection, scrollable lists, etc.) which can be invoked from a variety of flow components.
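As a rough sketch of this separation, a flow component can be reduced to a transition table keyed on the current state and the result returned by the presentation invoked in that state. The state names, result values and script names below are invented for illustration, and the invoke callback stands in for whatever mechanism actually runs a presentation.

```python
# Transition table: (current state, presentation result) -> next state.
TRANSITIONS = {
    ("collect_card", "ok"): "confirm",
    ("collect_card", "nomatch"): "collect_card",
    ("confirm", "yes"): "done",
    ("confirm", "no"): "collect_card",
}

# Presentation invoked in each state (hypothetical script names).
PRESENTATIONS = {
    "collect_card": "collect_card.vxml",
    "confirm": "confirm.vxml",
}


def run_flow(invoke, state="collect_card", max_steps=20):
    """Run the flow until 'done'; invoke(src) runs a presentation and returns its result."""
    visited = []
    for _ in range(max_steps):
        if state == "done":
            break
        visited.append(state)
        result = invoke(PRESENTATIONS[state])
        state = TRANSITIONS[(state, result)]
    return state, visited


# Usage with scripted results in place of real user interaction.
scripted = iter(["nomatch", "ok", "yes"])
final, visited = run_flow(lambda src: next(scripted))
```

Because the flow only sees results, the same presentation script could be reused by a different transition table without modification.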

Application developers can also take advantage of flow components which support parallel invocation of presentations. For example, an SCXML flow component may start three presentation components executing at the same time: one presents background music, another continuously listens for an attention word, and the third presents the application whose name the user speaks after the attention word.
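The parallel arrangement in this example can be approximated with concurrent tasks. The sketch below is only an analogy, not SCXML: the coroutine names and returned strings are invented, speech recognition is replaced by a placeholder, and the listener stops after the first attention word rather than listening continuously.

```python
import asyncio


async def background_music(stop):
    """Plays until told to stop (playback itself is not modelled)."""
    await stop.wait()
    return "music stopped"


async def attention_listener(heard):
    """Placeholder for recognition: reports the attention word once."""
    await asyncio.sleep(0)
    heard.set()
    return "attention word heard"


async def application(heard, stop):
    """Starts once the attention word is heard, then silences the music."""
    await heard.wait()
    stop.set()
    return "application started"


async def flow():
    """Hypothetical flow component starting three presentations in parallel."""
    heard, stop = asyncio.Event(), asyncio.Event()
    return await asyncio.gather(
        background_music(stop),
        attention_listener(heard),
        application(heard, stop),
    )


results = asyncio.run(flow())
```

Here asyncio.gather returns the results in the order the presentations were passed in, regardless of the order in which they finish.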

Finally, the framework promotes, but does not mandate, various application practices. The strong implication is that markup on each layer should only express what is appropriate at that layer. For example, presentation layer components should not express 'flow' concepts such as 'goto'. So instead of writing a single large VoiceXML presentation which uses <goto> to navigate between application states expressed as <form>s, the application could be written as a flow component and a set of 'micro-dialog' presentation components: for example, an SCXML flow component whose states correspond to application states, together with a set of (reusable) VoiceXML presentations, each composed of a single VoiceXML <form> that interacts with the user and returns results to the flow component's states. This modular approach facilitates application development, maintenance, debugging and reuse.