Position Paper for

W3C Workshop on Internationalizing

the Speech Synthesis Markup Language (SSML)

 

2-3 November 2005

Beijing, China

Title:            SSML Extension for Korean

Source:        Korea Telecom (KT)

                     Information and Communications University (ICU)*

Author:         Sang-Jin Kim*, Myoung-Wan Koo, Jae-In Kim, and Minsoo Hahn*

 

 

1. Introduction

The Speech Synthesis Markup Language Specification Version 1.0 (SSML) is designed to provide a rich, XML-based markup language for assisting the generation of synthetic speech in Web and other applications. However, SSML does not fully consider non-English languages, so internationalization extensions are required. This document specifies an SSML extension for Korean.

 

2. Characteristics of Korean

The Korean alphabet, Hangul, consists of forty letters: twenty-one represent vowels, including thirteen diphthongs, and nineteen represent consonants. Phonemes are combined to form a syllable, and several syllables are combined to form a word phrase (Eojeol in Korean), which is different from a phrase in English. The syllable structures of Korean are V, CV, VC, and CVC, where C and V stand for consonant and vowel, respectively.
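The syllable structures listed above can be illustrated with a minimal sketch that splits a precomposed Hangul syllable into its letters. It relies on the arithmetic layout of the Unicode Hangul Syllables block (19 initial consonants, 21 vowels, 27 optional final consonants); the function names decompose and syllable_shape are illustrative assumptions, not part of SSML or of any TTS system.

```python
# Decompose a precomposed Hangul syllable (U+AC00..U+D7A3) into its letters,
# using the arithmetic layout of the Unicode Hangul Syllables block.
CHOSEONG = [chr(0x1100 + i) for i in range(19)]          # initial consonants
JUNGSEONG = [chr(0x1161 + i) for i in range(21)]         # vowels
JONGSEONG = [""] + [chr(0x11A8 + i) for i in range(27)]  # optional final consonants


def decompose(syllable):
    """Return (initial, vowel, final) for one Hangul syllable; final may be ''."""
    index = ord(syllable) - 0xAC00
    if not 0 <= index < 19 * 21 * 28:
        raise ValueError("not a precomposed Hangul syllable")
    lead, rest = divmod(index, 21 * 28)
    vowel, tail = divmod(rest, 28)
    return CHOSEONG[lead], JUNGSEONG[vowel], JONGSEONG[tail]


def syllable_shape(syllable):
    """Classify a syllable as V, CV, VC, or CVC."""
    lead, _, tail = decompose(syllable)
    # The initial ieung (U+110B) is a silent placeholder, so it is not counted as C.
    onset = "" if lead == "\u110B" else "C"
    coda = "C" if tail else ""
    return onset + "V" + coda


if __name__ == "__main__":
    for s in "한국어":
        print(s, decompose(s), syllable_shape(s))
```

Treating the initial ieung as silent is what makes the V and VC shapes appear; a written syllable always has an initial letter, even when no consonant is pronounced.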

Korean and Japanese are completely different languages apart from their grammatical structure. Korean is also completely different from Chinese, even though Korean has borrowed many Chinese words and some Chinese characters.

 

3. SSML Extension for Chinese Characters in Korean

3.1 Chinese Characters in Korean

Modern Korean and Japanese both use many Chinese characters, but the pronunciation of a given character differs between the two languages. For example, 愛 (love) is pronounced /ä/ in Korean but /ai/ in Japanese. In many cases the same character is also written differently from country to country. For example, 車 (car) and 風 (wind) are used in Korea, whereas the simplified characters 车 (car) and 风 (wind) are used in China. These simplified characters are not used in Korea.

Although Korean text can be written entirely in Korean characters, it is not unusual to use Chinese characters as well. For example,

이 언어는 한국어입니다 (This language is Korean),

이 言語는 韓國語입니다 (This language is Korean),

이 언어(言語)는 한국어(韓國語)입니다 (This language is Korean).

The pronunciations of these sentences are exactly the same. Chinese characters are semantic symbols, whereas Korean characters are phonetic symbols. The Chinese-character word 韓國語 (Korean) is read as /한국어 (Korean)/, and its phonetic list is ㅎ+ㅏ+ㄴ+ㄱ+ㅜ+ㄱ+ㅓ. The input text for a text-to-speech (TTS) system has to be converted into a phonetic list, so any Chinese characters mixed with Korean characters have to be replaced by their Korean readings. We do not use all Chinese characters; instead, there is a list of 2000 frequently used Chinese characters recommended by the Korean government. Since their pronunciations differ from those in Chinese and Japanese, a Korean TTS system needs to use this list together with the Korean pronunciation of each character.
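The substitution step described above can be sketched as a simple table lookup. This is only an illustration under stated assumptions: the three-entry table HANJA_TO_HANGUL and the function to_hangul are hypothetical, and a real system would load the full list of frequently used characters with their Korean readings and would also have to handle characters with more than one reading.

```python
# Hypothetical three-entry lexicon mapping Chinese characters to their
# Korean readings; a real lexicon would cover the full recommended list.
HANJA_TO_HANGUL = {
    "韓": "한",
    "國": "국",
    "語": "어",
}


def to_hangul(text, lexicon=HANJA_TO_HANGUL):
    """Replace every character found in the lexicon by its Korean reading.

    Characters that are not in the lexicon (Hangul, spaces, punctuation,
    unknown Chinese characters) are passed through unchanged.
    """
    return "".join(lexicon.get(ch, ch) for ch in text)


if __name__ == "__main__":
    print(to_hangul("韓國語"))
```

After this step the text contains only Korean characters, so the usual conversion into a phonetic list can proceed without any special case for Chinese characters.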

3.2 Chinese Character Problem for a Person's Name, Place Name, or Proper Noun

: ,

: ϰ, ¡

: , ƻ

: ,

Although Korean uses Chinese characters, their pronunciations differ from those in Chinese and Japanese. For proper nouns such as personal names and place names, it is recommended to use the original pronunciation. For example, 北京 is the capital city of China. Chinese speakers say it as /Beijing/, but Koreans say /Bookgyoung/ in the Korean way. If a word is not a proper noun, the pronunciation does not need to be changed, because the Korean reading of Chinese characters is part of the Korean language. For proper nouns that originate from China, however, the Chinese pronunciation should be used. The same applies to Japanese. 東京 is the capital city of Japan, and Japanese speakers say it as /Tokyo/. The correct pronunciation of these Chinese characters is therefore /Tokyo/.

3.3 SSML Extension for Chinese characters in Korean

The SSML recommendation defines a lexicon element, which specifies the location of a pronunciation lexicon file. If the Chinese characters have to be read with Korean pronunciation, the lang=ko-CN attribute has to be specified; for Japanese pronunciation, lang=ja-CN; and so on.

<lexicon lang="ko" uri="http://www.multilingual.org/lexicon.file"/>

<lexicon lang="ko-CN" uri="http://www.multilingual.org/Chinese_lexicon_freq_KR.file"/>

<lexicon lang="ja-KR" uri="http://www.multilingual.org/Chinese_lexicon_JP.file"/>

<lexicon lang="cn-KR" uri="http://www.multilingual.org/Chinese_lexicon_CN.file"/>
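As a rough illustration of how a processor might consume these declarations, the sketch below collects the lexicon elements into a table keyed by the proposed lang attribute and picks one with a fallback. The surrounding speak element, the helper names lexicons_by_lang and pick_lexicon, and the fallback rule are assumptions made for this example; the proposal itself does not define how a lexicon is selected.

```python
import xml.etree.ElementTree as ET

# The four declarations from the text, wrapped in a root element and written
# as well-formed XML (quoted attribute values, self-closing tags).
FRAGMENT = """
<speak>
  <lexicon lang="ko" uri="http://www.multilingual.org/lexicon.file"/>
  <lexicon lang="ko-CN" uri="http://www.multilingual.org/Chinese_lexicon_freq_KR.file"/>
  <lexicon lang="ja-KR" uri="http://www.multilingual.org/Chinese_lexicon_JP.file"/>
  <lexicon lang="cn-KR" uri="http://www.multilingual.org/Chinese_lexicon_CN.file"/>
</speak>
"""


def lexicons_by_lang(ssml):
    """Map each proposed lang value to the URI of its lexicon file."""
    root = ET.fromstring(ssml.strip())
    return {el.get("lang"): el.get("uri") for el in root.iter("lexicon")}


def pick_lexicon(table, lang, fallback="ko"):
    """Return the URI for lang, or the fallback lexicon if lang is unknown."""
    return table.get(lang, table.get(fallback))


if __name__ == "__main__":
    table = lexicons_by_lang(FRAGMENT)
    print(pick_lexicon(table, "ko-CN"))
    print(pick_lexicon(table, "ja-KR"))
```

Falling back to the plain Korean lexicon is one possible design choice; a processor could equally report an error when no lexicon matches the requested value.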

 

 

4. SSML Extension for Homograph Words in Korean

4.1 Same word, different pronunciation, different meaning in Korean

Some Korean words are written with the same characters but pronounced with different durations, and the meaning of the word changes with its duration. For example, 눈 (/nun/) means the human eye, while the same word pronounced with a longer vowel (/nu:n/) means snow. Since the characters are the same, a TTS system requires additional information to synthesize the speech appropriately.

눈 (/nun/ = eye) or 눈 (/nu:n/ = snow),

말 (/mal/ = horse) or 말 (/ma:l/ = speech),

화장 (/whajang/ = make-up, toilet) or 화장 (/wha:jang/ = cremation),

가장 (/kajang/ = most, exceedingly) or 가장 (/ka:jang/ = disguise, feint).

 

4.2 SSML Extension for Homograph Words in Korean

The only difference between these words is the duration of the pronunciation, so this duration information has to be given to the TTS system. The SSML recommendation provides the say-as and sub elements, but these elements cannot handle the above problem successfully. This is different from the Kanji and Kana problem in Japanese. We suggest a tone element for this problem. For Korean, the values long, short, and default for the type attribute of the tone element would be enough.

말은 달리는 말보다 빠르다. (= Speech is faster than a running horse)

<tone type="long">말</tone>은 달리는 <tone type="default">말</tone>보다 빠르다.
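One way a synthesizer front end might interpret the suggested tone element is sketched below: the marked-up sentence is split into text segments, each paired with a duration scaling factor. The wrapping speak element, the function tone_segments, and the numeric factors in DURATION_SCALE are placeholders assumed for illustration; the proposal only names the values long, short, and default and does not specify how much longer a long vowel should be.

```python
import xml.etree.ElementTree as ET

# Placeholder scaling factors (assumed for this sketch); real values would
# come from the duration model of the synthesizer.
DURATION_SCALE = {"long": 1.5, "short": 0.75, "default": 1.0}


def tone_segments(ssml):
    """Split marked-up text into (text, duration_scale) pairs.

    Only tone elements that are direct children of the root are handled;
    text outside any tone element gets the default scale of 1.0.
    """
    root = ET.fromstring(ssml)
    segments = []
    if root.text:
        segments.append((root.text, 1.0))
    for child in root:
        if child.tag == "tone":
            kind = child.get("type", "default")
            segments.append((child.text or "", DURATION_SCALE.get(kind, 1.0)))
        if child.tail:
            segments.append((child.tail, 1.0))
    return segments


if __name__ == "__main__":
    sentence = ('<speak><tone type="long">mal</tone> A '
                '<tone type="default">mal</tone> B</speak>')
    for text, scale in tone_segments(sentence):
        print(repr(text), scale)
```

The usage example uses romanized placeholders for the two homographs so that the output is easy to read; the same function works unchanged on Korean text.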

 

5. Conclusion

Modern Korean makes frequent use of Chinese words and Chinese characters, whose pronunciations differ from those in Japanese and Chinese. A pronunciation lexicon of Chinese characters is therefore needed for successful Korean TTS systems, which is why we propose the lang attribute for the lexicon element. We also propose the type=long attribute for the tone element, if this element is accepted as a new element for tonal languages.

 

References

 

[1] Speech Synthesis Markup Language (SSML) Version 1.0, W3C Recommendation, 7 September 2004.

[2] Ji-young Shin, Understanding of Korean Speech (in Korean), Hankook-moonwhasa, 2000.