Position Paper of Panasonic Beijing Laboratory for W3C Workshop on Internationalizing the SSML

Hairong Xia

Panasonic Beijing Laboratory



1. Introduction to Panasonic Beijing Laboratory

Panasonic Beijing Laboratory (PBL), established in 2001, is an overseas laboratory of Matsushita Electric Industrial Co., Ltd. The goal of PBL is to develop next-generation human-machine interaction technologies. Our current R&D focuses on speech and language processing, robot control, and image processing. Past projects include home networking, sensor networks, and digital TV.

As Panasonic's pioneer laboratory in China, PBL actively participates in technology activities and organizations such as the AVS and DTV standardization groups. PBL also maintains good relationships with local universities and institutes by sponsoring national academic conferences, and is willing to cooperate with other companies and organizations to advance technology for the benefit of humankind.

2. Speech Synthesis in PBL

The speech synthesis team belongs to the speech technology group in PBL. In 2004, Panasonic initiated a global project on high-performance speech technologies. Members of this project come from Japan, the USA (Panasonic Speech Technology Lab), and China (PBL).

At PBL, research on Mandarin synthesis has been under way for more than two years. So far, we have developed an unlimited-vocabulary Mandarin TTS system based on a large corpus; according to our evaluation, its quality is comparable to that of some products in the industry. Development of a small-database system is ongoing.

3. Our Desired Topics in the Workshop

PBL is interested in the representation of word and phrase boundaries and of tone inflection for Chinese. The other workshop topics also interest us. We are considering the following extensions to the current SSML for Chinese.

3.1 Dialect selection

3.2 Pronunciation of character

3.3 Sound effect filter

3.4 Major phrase boundaries

3.5 Speaking style template

3.6 Element value: macro

3.7 Extension for say-as element: translation

4. Reference

[1] ZHANG Zirong, CHU Min, "A Statistical Approach for Grapheme-to-Phoneme Conversion in Chinese", Microsoft Research Asia website.