W3C

Voice Browsing

The W3C Voice Browser Working Group created the Speech Interface Framework, which enables developers to build speech-enabled applications based on Web technologies. The result is an environment that will feel familiar to anyone with Web development experience.

What is Voice Browsing? - Part of W3C's One Web Vision

Voice Browsing refers to using speech to navigate an application. These applications are written using parts of the Speech Interface Framework. In much the same way that Web applications are written in HTML and are rendered in a Web browser, speech applications are written in VoiceXML and are rendered via a Voice Browser.
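To make the HTML analogy concrete, here is a minimal sketch of what a VoiceXML 2.0 document can look like, wrapped in a short Python script that checks it is well-formed and lists its prompts. The dialog itself (the form id "flight_status", the field name "flight", and the prompt wording) is a hypothetical illustration rather than something taken from a W3C example, and the built-in "digits" field type is assumed to be supported by the voice browser that runs it.

```python
import xml.etree.ElementTree as ET

# Namespace used by VoiceXML 2.0 documents.
VXML_NS = "http://www.w3.org/2001/vxml"

# Hypothetical dialog: ask for a flight number, then confirm it.
# A voice browser would speak each <prompt> and listen for the <field>.
DOCUMENT = """<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="flight_status">
    <field name="flight" type="digits">
      <prompt>Please say your flight number.</prompt>
      <filled>
        <prompt>Looking up flight <value expr="flight"/>.</prompt>
      </filled>
    </field>
  </form>
</vxml>
"""


def list_prompts(source):
    """Parse a VoiceXML string and return the text of each <prompt>."""
    root = ET.fromstring(source.encode("utf-8"))
    prompts = []
    for prompt in root.iter("{%s}prompt" % VXML_NS):
        # Join the text fragments and collapse runs of whitespace.
        text = " ".join("".join(prompt.itertext()).split())
        prompts.append(text)
    return prompts


if __name__ == "__main__":
    for line in list_prompts(DOCUMENT):
        print(line)
```

The script only inspects the markup; it does not speak or recognize anything. Rendering the dialog is the job of a voice browser, just as rendering HTML is the job of a Web browser.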

Real-world voice-driven Web applications abound, though people may not always realize they are interacting with a Web service; examples include airline departure and arrival information, banking transactions, automated phone appointment reminders, and automated telephone receptionists. By one estimate, over 85% of Interactive Voice Response (IVR) applications for telephones (including mobile) use W3C's VoiceXML standard.

What is Voice Browsing Used For?

There are 10 times as many phones in the world as connected PCs. Phones will become the major portal to the Web. Speech recognition is not yet widely associated with the 'visual Web', but this will change as devices continue to shrink and make keyboards impractical, and as cell phones become more prevalent in regions with low literacy rates.

Asking for directions while driving and hearing the response through speech synthesis illustrates how practical "hands-free" applications can be to mobile users. Voice applications also benefit people with some disabilities (such as vision limitations) and people who cannot read.

W3C considers voice access to be one piece of more general "multimodal" access, where users can use combinations of means to interact: voice input, speech feedback, electronic ink, touch input, and physical gestures (such as those used in some video games). The Voice Browser Working Group and the Multimodal Interaction Working Group are coordinating their efforts to make the Web available on more devices and in more situations.

Examples

Some possible applications of Voice Browsing include:

  • Accessing business information, including the corporate "front desk" that asks callers who or what they want; automated telephone ordering services; support desks; order tracking; airline arrival and departure information; cinema and theater booking services; and home banking services such as transferring money between accounts, purchasing an item, or trading stock.
  • Accessing public information, including community information such as weather, traffic conditions, school closures, directions and events; local, national and international news; national and international stock market information; and business and e-commerce transactions.
  • Accessing personal information, including calendars, address and telephone lists, to-do lists, shopping lists, and calorie counters.
  • Helping the user communicate with other people by sending and receiving voice-mail and email messages.

Learn More

The mission of the Voice Browser Working Group is to enable users to speak and listen to Web applications by creating standard languages for developing Web-based speech applications. The Voice Browser Working Group concentrates on languages for capturing and producing speech and for managing the dialog between user and computer. A related group, the Multimodal Interaction Working Group, concentrates on additional input modes such as keyboard and mouse, and ink and pen.

Visit the Voice Browser Activity home page for more information.
