Copyright © 2008 W3C® (MIT, ERCIM, Keio), All Rights Reserved. W3C liability, trademark and document use rules apply.
This document defines the syntax for specifying pronunciation lexicons to be used by Automatic Speech Recognition and Speech Synthesis engines in voice browser applications.
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.
This is the 18 August 2008 W3C Proposed Recommendation of "Pronunciation Lexicon Specification (PLS) Version 1.0". W3C publishes a technical report as a Proposed Recommendation to indicate that the document is a mature technical report that has received wide review for technical soundness and implementability and to request final endorsement from the W3C Advisory Committee. Proposed Recommendation status is described in section 7.1.1 of the Process Document.
The W3C Membership and other interested parties are invited to review the document and send comments to the Working Group's public mailing list www-voice@w3.org (archive) until 18 September 2008, 23:59 EDT. See W3C mailing list and archive usage guidelines. Advisory Committee Representatives should consult their WBS questionnaires.
The Voice Browser Working Group believes that this specification addresses its requirements and all Last Call and Candidate Recommendation issues. Known implementations are documented in the PLS 1.0 Implementation Report, along with the associated test suite.
Since the Candidate Recommendation in December 2007, the following changes were applied to the specification: updated definition of URI (Section 1.5), clarified usage of white space in IPA transcriptions (Section 2), clarified definition of xml:base attribute (Section 4.1), and applied minor editorial changes. Changes from the previous Working Draft can be found in Appendix D. Please check the Disposition of Comments received during the Candidate Recommendation period.
Section 2. Pronunciation Alphabets describes
  the legal values of the alphabet attribute for
  specifying a pronunciation alphabet. The Working Group is
  requesting the creation of a Pronunciation Alphabet registry with
  IANA so that pronunciation alphabets other than "ipa" can also be
  used. The location of the registry will be provided at http://www.w3.org/2001/10/synthesis
  when the registry becomes available. A future version of the PLS
  specification may permit values from this registry to be used in
  the alphabet attribute.
This document has been produced as part of the W3C Voice Browser Activity, following the procedures set out for the W3C Process.
This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.
Publication as a Proposed Recommendation does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.
This section is informative.
The accurate specification of pronunciation is critical to the success of speech applications. Most Automatic Speech Recognition (ASR) and Text-To-Speech (TTS) engines internally provide extensive high-quality lexicons with pronunciation information for many words or phrases. To ensure maximum coverage of the words or phrases used by an application, application-specific pronunciations may be required. For example, these may be needed for proper nouns such as surnames or business names.
The Pronunciation Lexicon Specification (PLS) is designed to enable interoperable specification of pronunciation information for both ASR and TTS engines. The language is intended to be easy to use by developers while supporting the accurate specification of pronunciation information for international use.
The language allows one or more pronunciations for a word or phrase to be specified using a standard pronunciation alphabet or, if necessary, vendor-specific alphabets. Pronunciations are grouped together into a PLS document which may be referenced from other markup languages, such as the Speech Recognition Grammar Specification [SRGS] and the Speech Synthesis Markup Language [SSML].
In its most general sense, a lexicon is merely a list of words or phrases, possibly containing information associated with and related to the items in the list. This document uses the term "lexicon" in only one specific way, as "pronunciation lexicon". In this particular document, "lexicon" means a mapping between words (or short phrases), their written representations, and their pronunciations suitable for use by an ASR engine or a TTS engine. Pronunciation lexicons are not only useful for voice browsers; they have also proven effective mechanisms to support accessibility for persons with disabilities as well as greater usability for all users. They are used to good effect in screen readers and user agents supporting multimodal interfaces.
A TTS engine aims to transform input content (either text or markup, such as SSML) into speech. This activity involves several processing steps:
SSML enables a user to control and enhance TTS activity by acting through SSML elements on these levels of processing (see [SSML] for details).
PLS is intended to be the standard format of the documents
  referenced by the <lexicon> element of SSML (see 
  Section 3.1.4 of [SSML]).
The following is a simple example of an SSML document. It includes an Italian movie title and the name of the director to be read in US English.
<?xml version="1.0" encoding="UTF-8"?>
<speak version="1.0" 
    xmlns="http://www.w3.org/2001/10/synthesis" 
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.w3.org/2001/10/synthesis
      http://www.w3.org/TR/speech-synthesis/synthesis.xsd"
    xml:lang="en-US">
    
    The title of the movie is: "La vita è bella" (Life is beautiful),
    which is directed by Roberto Benigni. 
</speak>
For the Italian title and the director's name to be pronounced correctly, their pronunciations might be included inline in the SSML document.
<?xml version="1.0" encoding="UTF-8"?>
<speak version="1.0" 
    xmlns="http://www.w3.org/2001/10/synthesis" 
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.w3.org/2001/10/synthesis
      http://www.w3.org/TR/speech-synthesis/synthesis.xsd"
    xml:lang="en-US">
    
    The title of the movie is: 
    <phoneme alphabet="ipa" ph="ˈlɑ ˈviːɾə ˈʔeɪ ˈbɛlə">"La vita è bella"</phoneme>
    <!-- The IPA pronunciation is:
    "ˈlɑ ˈviːɾə
     ˈʔeɪ ˈbɛlə" --> 
    (Life is beautiful),
    which is directed by 
    <phoneme alphabet="ipa" ph="ɹəˈbɛːɹɾoʊ bɛˈniːnji">Roberto Benigni.</phoneme>
    <!-- The IPA pronunciation is:
    "ɹəˈbɛːɹɾoʊ
     bɛˈniːnji" --> 
</speak>
  Using PLS, all the pronunciations can be factored out into an
  external PLS document which is referenced by the <lexicon> element of SSML (see 
  Section 3.1.4 of [SSML]).
<?xml version="1.0" encoding="UTF-8"?>
<speak version="1.0" 
    xmlns="http://www.w3.org/2001/10/synthesis" 
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.w3.org/2001/10/synthesis
      http://www.w3.org/TR/speech-synthesis/synthesis.xsd"
    xml:lang="en-US">
    <lexicon uri="http://www.example.com/movie_lexicon.pls"/>
    The title of the movie is: "La vita è bella" (Life is beautiful),
    which is directed by Roberto Benigni. 
</speak>
  The referenced lexicon might look something like this:
<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0" 
      xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
      xsi:schemaLocation="http://www.w3.org/2005/01/pronunciation-lexicon 
        http://www.w3.org/TR/2007/CR-pronunciation-lexicon-20071212/pls.xsd"
      alphabet="ipa" xml:lang="en-US">
  <lexeme>
    <grapheme>La vita è<!-- same as: &#xE8; --> bella</grapheme>
    <phoneme>ˈlɑ ˈviːɾə ˈʔeɪ ˈbɛlə</phoneme>
    <!-- IPA string is:
     "ˈlɑ ˈviːɾə
      ˈʔeɪ ˈbɛlə" --> 
  </lexeme>
  <lexeme>
    <grapheme>Roberto</grapheme>
    <phoneme>ɹəˈbɛːɹɾoʊ</phoneme>
    <!-- IPA string is:
     "ɹəˈbɛːɹɾoʊ" --> 
  </lexeme>
  <lexeme>
    <grapheme>Benigni</grapheme>
    <phoneme>bɛˈniːnji<!-- IPA string is:
     "bɛˈniːnji" --></phoneme>
  </lexeme>
</lexicon>
The PLS engine will load the external PLS document and transparently apply the pronunciations during the processing of the SSML document. An application may contain several distinct PLS documents to be used at different points within the application. Section 3.1.4 of [SSML] describes how to use more than one lexicon document referenced in an SSML document.
Given that many platform/browser/text editor combinations do not correctly cut and paste Unicode text, IPA symbols may be entered as numeric character references (see Section 4.1 on Character and Entity References of either XML 1.0 [XML10] or XML 1.1 [XML11]) in the pronunciation. However, the UTF-8 representation of an IPA symbol should always be used in preference to its numeric character reference. In order to overcome potential problems with viewing the UTF-8 representation of IPA symbols in this document, pronunciation examples are also shown in a comment using numeric character references.
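As a minimal sketch of the numeric character reference form described above, the following lexicon repeats the "Benigni" entry with its non-ASCII IPA symbols written as references. After XML parsing, the content of the <phoneme> element is the same string as the UTF-8 form used in the earlier example. The xsi:schemaLocation attribute is omitted here only for brevity.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0"
      xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
      alphabet="ipa" xml:lang="en-US">
  <lexeme>
    <grapheme>Benigni</grapheme>
    <!-- &#x25B; is U+025B (open-mid front unrounded vowel),
         &#x2C8; is U+02C8 (primary stress mark),
         &#x2D0; is U+02D0 (length mark) -->
    <phoneme>b&#x25B;&#x2C8;ni&#x2D0;nji</phoneme>
  </lexeme>
</lexicon>
```

The reference form is harder to read and to proofread, which is one reason the UTF-8 representation is preferred wherever the authoring tools handle it correctly.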
An ASR engine transforms an audio signal into a recognized sequence of words or a semantic representation of the meaning of the utterance (see Semantic Interpretation for Speech Recognition [SISR] for a standard definition of Semantic Interpretation).
An ASR grammar is used to improve ASR performance by describing the possible words and phrases the ASR might recognize. SRGS is the standard definition of ASR grammars (see [SRGS] for details).
PLS may be used by an ASR processor to allow multiple pronunciations of words and phrases, and also to do limited text normalization, such as acronym expansion and abbreviations.
PLS entries are applied to the graphemes inside SRGS grammar rules to convert them into the phonemes to be recognized. See the example below and the example in Section 1.3 for a PLS document used for both ASR and TTS.
There might be other uses of PLS, for instance in a dictation system or for unconstrained ASR, which might be beyond the scope of this specification.
This is a very simple SRGS grammar that allows the recognition of sentences like "Boston Massachusetts" or "Miami Florida".
<?xml version="1.0" encoding="UTF-8"?>
<grammar version="1.0"
  xmlns="http://www.w3.org/2001/06/grammar"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
  xsi:schemaLocation="http://www.w3.org/2001/06/grammar 
    http://www.w3.org/TR/speech-grammar/grammar.xsd"
  xml:lang="en-US" root="city_state" mode="voice">
  <rule id="city" scope="public">
    <one-of> <item>Boston</item> 
             <item>Miami</item> 
             <item>Fargo</item> </one-of> 
  </rule>
  <rule id="state" scope="public">
    <one-of> <item>Florida</item>
             <item>North Dakota</item>
             <item>Massachusetts</item> </one-of>
  </rule> 
  
  <rule id="city_state" scope="public"> 
     <ruleref uri="#city"/> <ruleref uri="#state"/>
  </rule>
</grammar>
If a pronunciation lexicon is referenced by an SRGS grammar, it can allow multiple pronunciations of the words in the grammar to accommodate different speaking styles. Here is the same grammar with a reference to an external PLS document.
<?xml version="1.0" encoding="UTF-8"?>
<grammar version="1.0"
  xmlns="http://www.w3.org/2001/06/grammar"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
  xsi:schemaLocation="http://www.w3.org/2001/06/grammar 
    http://www.w3.org/TR/speech-grammar/grammar.xsd"
  xml:lang="en-US" root="city_state" mode="voice">
  
  <lexicon uri="http://www.example.com/city_lexicon.pls"/>
  <rule id="city" scope="public">
    <one-of> <item>Boston</item> 
             <item>Miami</item> 
             <item>Fargo</item> </one-of> 
  </rule>
  <rule id="state" scope="public">
    <one-of> <item>Florida</item>
             <item>North Dakota</item>
             <item>Massachusetts</item> </one-of>
  </rule> 
  
  <rule id="city_state" scope="public"> 
     <ruleref uri="#city"/> <ruleref uri="#state"/>
  </rule>
</grammar>
  Note also that an SRGS grammar might reference multiple PLS documents.
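The lexicon referenced by the grammar above is not shown in this specification. As a hypothetical sketch, a document served at the city_lexicon.pls URI might list two pronunciations for some of the city names. The transcriptions are illustrative assumptions, not authoritative pronunciations, and the xsi:schemaLocation attribute is omitted for brevity.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0"
      xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
      alphabet="ipa" xml:lang="en-US">
  <lexeme>
    <grapheme>Boston</grapheme>
    <!-- Two alternative pronunciations for the same grapheme -->
    <phoneme>ˈbɔːstən</phoneme>
    <phoneme>ˈbɑːstən</phoneme>
  </lexeme>
  <lexeme>
    <grapheme>Miami</grapheme>
    <phoneme>maɪˈæmi</phoneme>
    <phoneme>maɪˈæmə</phoneme>
  </lexeme>
</lexicon>
```

Grammar items without an entry in this lexicon, such as "Fargo", would presumably fall back to the pronunciations built into the ASR engine.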
A VoiceXML 2.0 application ([VXML]) contains SRGS grammars for ASR and SSML prompts for TTS. The introduction of PLS into both SRGS and SSML will directly impact VoiceXML applications.
The benefits described in Section 1.1 and Section 1.2 are also available in VoiceXML applications. The application may use several contextual PLS documents at different points in the interaction, but may also use the same PLS document both in SRGS, to improve ASR, and in SSML, to improve TTS. Here is an example PLS document:
<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0" 
      xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
      xsi:schemaLocation="http://www.w3.org/2005/01/pronunciation-lexicon 
        http://www.w3.org/TR/2007/CR-pronunciation-lexicon-20071212/pls.xsd"
      alphabet="ipa" xml:lang="en-US">
  <lexeme>
    <grapheme>judgment</grapheme>
    <grapheme>judgement</grapheme>
    <phoneme>ˈdʒʌdʒ.mənt</phoneme>
    <!-- IPA string is:
    "ˈdʒʌdʒ.mənt" --> 
  </lexeme>
  <lexeme>
    <grapheme>fiancé</grapheme>
    <grapheme>fiance</grapheme>
    <phoneme>fiˈɒns.eɪ</phoneme>
    <!-- IPA string is:
    "fiˈɒns.eɪ" --> 
    <phoneme>ˌfiː.ɑːnˈseɪ</phoneme>
    <!-- IPA string is:
    "ˌfiː.ɑːnˈseɪ" --> 
  </lexeme>
</lexicon>
  which could be used to improve TTS as shown in the following SSML document:
<?xml version="1.0" encoding="UTF-8"?>
<speak version="1.0" 
    xmlns="http://www.w3.org/2001/10/synthesis" 
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.w3.org/2001/10/synthesis
      http://www.w3.org/TR/speech-synthesis/synthesis.xsd"
    xml:lang="en-US">
    <lexicon uri="http://www.example.com/lexicon_defined_above.xml"/>
    <p> In the judgement of my fiancé, Las Vegas is the best place for a honeymoon.
              I replied that I preferred Venice and didn't think the Venetian casino was an
              acceptable compromise.</p>
</speak>
  but also to improve ASR in the following SRGS grammar:
<?xml version="1.0" encoding="UTF-8"?>
<grammar version="1.0"
    xmlns="http://www.w3.org/2001/06/grammar"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
    xsi:schemaLocation="http://www.w3.org/2001/06/grammar 
      http://www.w3.org/TR/speech-grammar/grammar.xsd"
    xml:lang="en-US" root="movies" mode="voice">
  
  <lexicon uri="http://www.example.com/lexicon_defined_above.xml"/>
  <rule id="movies" scope="public">
     <one-of>
             <item>Terminator 2: Judgment Day</item> 
             <item>My Big Fat Obnoxious Fiance</item> 
             <item>Pluto's Judgement Day</item>
     </one-of> 
  </rule>
</grammar>
The current specification is focused on the major features described in the requirements document [REQS]. The most complex features have been postponed to a future revision of this specification. Some of the complex features not included, for instance, are the introduction of morphological, syntactic and semantic information associated with pronunciations (such as word stems, inter-word semantic links, pronunciation statistics, etc.). Many of these features can be specified using RDF statements [RDF-XMLSYNTAX] that reference lexemes within one or more pronunciation lexicons.
anyURI primitive as defined
    in Section 3.2.17 of XML Schema Part 2: Datatypes [XML-SCHEMA-2]. For
    informational purposes only, [RFC3986] and [RFC2732] may be useful in understanding the
    structure, format, and use of URIs. Note that IRIs (see
    [RFC3987]) are
    permitted within the above definition of URI.
A phonemic/phonetic
  alphabet is used to specify a pronunciation. An alphabet in
  this context refers to a collection of symbols to represent the
  sounds of one or more human languages. In the PLS specification
  the pronunciation alphabet is specified by the
  alphabet attribute (see Section
  4.1 and Section 4.6 for details on the
  use of this attribute). The only valid values for the
  alphabet attribute are "ipa" (see the
  next paragraph) and vendor-defined strings of the form
  "x-organization" or
  "x-organization-alphabet". For example, the Japan
  Electronics and Information Technology Industries Association
  [JEITA] might wish to encourage the use
  of an alphabet such as "x-jeita" or
  "x-jeita-2000" for their phoneme alphabet [JEIDAALPHABET]. Another example might be
  "x-sampa" [X-SAMPA], an
  extension of the SAMPA phonetic
  alphabet [SAMPA] to cover the entire
  range of characters in the International
  Phonetic Alphabet [IPA].
A compliant PLS processor MUST support "ipa" as the value
  of the alphabet attribute. This means that the PLS
  processor MUST support the Unicode representations of the
  phonetic characters developed by the International Phonetic
  Association [IPA]. In addition to an
  exhaustive set of vowel and consonant symbols, this character set
  supports a syllable delimiter, numerous diacritics, stress
  symbols, lexical tone symbols, intonational markers and more. For
  this alphabet, legal phonetic/phonemic values are strings of the
  values specified in Appendix 2 of [IPAHNDBK]; note that an IPA transcription may
  contain white space characters to assist readability, which have
  no implications for the pronunciation. Informative tables of the
  IPA-to-Unicode mappings can be found at [IPAUNICODE1] and [IPAUNICODE2]. Note that not all of the IPA
  characters are available in Unicode. For processors supporting
  this alphabet,
Note that there are peculiarities in the IPA alphabet which might have implications for implementers, for instance equivalent, withdrawn and superseded IPA symbols; see Appendix 2 of [IPAHNDBK] for further details.
When IPA symbols are used to represent the phonemes of a language, there can be an ambiguity concerning which allophonic symbol to select to represent a phoneme. Note that this may result in inconsistencies between lexicons which were composed for the identical language.
Currently there is no ready way for a blind or partially sighted person to read or interact with a lexicon containing IPA symbols. It is hoped that implementers will provide tools which will enable such an interaction.
A legal Pronunciation Lexicon Specification document
    MUST
    have a legal XML Prolog from Section 2.8 of either XML 1.0
    [XML10] or XML 1.1 [XML11]. The XML prolog is followed by the
    root <lexicon> element.
    See Section 4.1 for details on this
    element.
The <lexicon> element
    MUST
    designate the PLS namespace. This can be achieved by declaring
    an xmlns attribute or an attribute
    with an "xmlns" prefix. See Section 2 of Namespaces in XML
    (Namespaces in XML 1.0 [XML-NS10]
    or Namespaces in XML 1.1 [XML-NS11]) for details. Note that when the
    xmlns attribute is used alone, it sets
    the default namespace for the element on which it appears and
    for any child elements. The namespace for PLS is defined to be
    "http://www.w3.org/2005/01/pronunciation-lexicon".
It is RECOMMENDED that the <lexicon> element also indicate
    the location of the PLS schema (see Appendix
    A) via the 
    xsi:schemaLocation attribute from 
    Section 2.6.3 of XML Schema Part 1: Structures Second
    Edition [XML-SCHEMA-1].
The following is an example of a legal PLS header:
<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0" 
      xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
      xsi:schemaLocation="http://www.w3.org/2005/01/pronunciation-lexicon 
        http://www.w3.org/TR/2007/CR-pronunciation-lexicon-20071212/pls.xsd"
      alphabet="ipa" xml:lang="en-US">
  This section enumerates the conformance rules of this specification.
All sections in this specification are normative, unless otherwise indicated. The informative parts of this specification are identified by "Informative" labels within sections.
Individual conformance requirements or testable statements are identifiable in the PLS specification through imperative voice statements. The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119]. However, for readability, these words do not appear in all uppercase letters in this specification.
A document is a Conforming Pronunciation Lexicon Specification Document if it meets both the following conditions:
<lexicon> root element as
      specified in Section 3.1.
This specification and these conformance criteria provide no designated size limits on any aspect of PLS documents. There are no maximum values on the number of elements, the amount of character data, or the number of characters in attribute values.
The PLS namespace MAY be used with other XML namespaces as per the Namespaces in XML Recommendations (Namespaces in XML 1.0 [XML-NS10] or Namespaces in XML 1.1 [XML-NS11]). Future work by W3C is expected to address ways to specify conformance for documents involving multiple namespaces.
A Conforming Pronunciation Lexicon Specification Processor MUST be able to parse and process Conforming Pronunciation Lexicon Specification documents.
In a Conforming Pronunciation Lexicon Specification Processor, the XML parser MUST be able to parse and process all XML constructs defined by either XML 1.0 [XML10] or XML 1.1 [XML11] and conform to the corresponding Namespaces in XML specification (Namespaces in XML 1.0 [XML-NS10] or Namespaces in XML 1.1 [XML-NS11]).
A Conforming Pronunciation Lexicon Specification Processor MUST conform to the XML 1.0 or XML 1.1 requirements for conformant non-validating processors.
A Conforming Pronunciation Lexicon Specification Processor MUST correctly understand and apply the semantics of each markup element as described by this document.
A Conforming Pronunciation Lexicon Specification Processor MUST meet the following requirements for handling of natural (human) languages:
xml:lang attribute (on the <lexicon> element) has a value representing a natural (human) language that the Pronunciation Lexicon Specification Processor claims to support, the Processor is REQUIRED to successfully parse and treat all text encountered as if in that language in order to be a Conforming Processor.
xml:lang attribute (on the <lexicon> element) has a value representing a natural (human) language that the Processor does not support.
When a Conforming Pronunciation Lexicon Specification Processor encounters elements or attributes that are not declared in this specification and such elements or attributes occur where it is not forbidden in this specification, the processor MAY choose to:
Except where stated in this document, there is no conformance requirement with respect to performance of rendering pronunciations as acoustic structures (models, waveforms, etc.) for ASR and TTS.
The Pronunciation Lexicon markup language consists of the following elements and attributes:
| Elements | Attributes | Description |
|---|---|---|
| <lexicon> | version, xml:base, xmlns, xml:lang, alphabet | root element for PLS |
| <meta> | name, http-equiv, content | element containing meta data |
| <metadata> | | element containing meta data |
| <lexeme> | xml:id, role | the container element for a single lexical entry |
| <grapheme> | | contains orthographic information for a lexeme |
| <phoneme> | prefer, alphabet | contains pronunciation information for a lexeme |
| <alias> | prefer | contains acronym expansions and orthographic substitutions |
| <example> | | contains an example of the usage for a lexeme |
<lexicon> Element
The root element of the Pronunciation Lexicon markup language
  is the <lexicon> element.
  This element is the container for all other elements of the PLS
  language. A <lexicon>
  element MUST contain zero or more <meta> elements, followed by an
  OPTIONAL <metadata> element, followed by
  zero or more <lexeme>
  elements. Note that a PLS document without any <lexeme> elements may be useful as
  a placeholder for future lexical entries.
The <lexicon> element
  MUST
  specify an alphabet attribute which indicates the
  default pronunciation alphabet to be used within the PLS
  document. The values of the alphabet attribute are
  described in Section 2. The default
  pronunciation alphabet MAY be overridden for a given lexeme using the
  <phoneme> element.
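A minimal sketch of such an override follows. The lexicon declares "ipa" as its default alphabet, and the second pronunciation switches to the vendor-defined "x-sampa" value mentioned in Section 2. Both transcriptions are illustrative assumptions, and a processor is only required to support "ipa", so whether the second pronunciation is usable depends on the processor.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0"
      xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
      alphabet="ipa" xml:lang="en-US">
  <lexeme>
    <grapheme>tomato</grapheme>
    <!-- Uses the default alphabet declared on the lexicon element -->
    <phoneme>təˈmeɪtoʊ</phoneme>
    <!-- Overrides the default alphabet for this pronunciation only -->
    <phoneme alphabet="x-sampa">t@"mA:t@U</phoneme>
  </lexeme>
</lexicon>
```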
The REQUIRED version attribute indicates
  the version of the specification to be used for the document and
  MUST
  have the value "1.0".
The REQUIRED xml:lang attribute allows
  identification of the language for which the pronunciation lexicon is relevant. IETF
  Best Current Practice 47 [BCP47] is the normative reference on the values of the
  xml:lang attribute.
Note that xml:lang specifies a single unique
  language for the entire PLS document. This does not limit the
  ability to create multilingual SRGS
  [SRGS] and SSML
  [SSML] documents. These documents may
  reference multiple pronunciation
  lexicons, possibly written for different languages.
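As a hedged illustration of this point, an SSML document might reference one lexicon per language. The two lexicon URIs below are hypothetical, and each referenced PLS document would declare a single xml:lang value of its own. Section 3.1.4 of [SSML] remains the authority on how multiple lexicons are handled.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<speak version="1.0"
    xmlns="http://www.w3.org/2001/10/synthesis"
    xml:lang="en-US">
    <!-- Hypothetical URIs: one PLS document for en-US entries, one for it-IT entries -->
    <lexicon uri="http://www.example.com/lexicon_en-US.pls"/>
    <lexicon uri="http://www.example.com/lexicon_it-IT.pls"/>
    The title of the movie is: "La vita è bella" (Life is beautiful),
    which is directed by Roberto Benigni.
</speak>
```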
The namespace URI for PLS is
  "http://www.w3.org/2005/01/pronunciation-lexicon".
  All PLS markup MUST be associated with the PLS namespace, using a
  Namespace Declaration as described in either Namespaces in XML
  1.0 [XML-NS10] or Namespaces in XML
  1.1 [XML-NS11]. This can for instance
  be achieved by declaring an xmlns attribute on the
  <lexicon> element, as the
  examples in this specification show.
PLS documents MAY include the xml:base attribute as
  defined in [XML-BASE].
  Note that as in the HTML 4.01 specification [HTML], this is a URI which all the relative
  references within the document take as their base.
Note that in this version of the specification, only the contents of metadata can potentially use relative URIs.
The following is a simple PLS document for the word "tomato" and its pronunciation.
<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0" 
      xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
      xsi:schemaLocation="http://www.w3.org/2005/01/pronunciation-lexicon 
        http://www.w3.org/TR/2007/CR-pronunciation-lexicon-20071212/pls.xsd"
      alphabet="ipa" xml:lang="en-US">
  <lexeme>
    <grapheme>tomato</grapheme>
    <phoneme>təmei̥ɾou̥</phoneme>
    <!-- IPA string is: "təmei̥ɾou̥" -->
  </lexeme>
</lexicon>
<meta> Element
The <metadata> and
  <meta> elements are
  containers in which information about the document can be placed.
  The <metadata> element
  provides more general and powerful treatment of metadata
  information than <meta> by
  using a metadata schema.
A <meta> element
  associates a string to a declared meta property or declares
  http-equiv content. Either a name or
  http-equiv attribute is REQUIRED. It
  is an error to provide both name and
  http-equiv attributes. A content
  attribute is also REQUIRED. The only <meta> property defined by this
  specification is "seeAlso". It is used to specify a
  resource that might provide additional metadata information about
  the content. This property is modeled on the
  "seeAlso" property from Section
  5.4.1 of "RDF Vocabulary Description Language 1.0: RDF
  Schema" [RDF-SCHEMA]. The
  http-equiv attribute has a special significance when
  documents are retrieved via HTTP. Although the preferred method
  of providing HTTP header information is to use HTTP header
  fields, the http-equiv content MAY be used in
  situations where the PLS document author is unable to configure
  HTTP header fields associated with their document on the origin
  server, for example, cache control information. Note that HTTP
  servers and caches are not required to inspect the contents of
  <meta> in PLS documents
  and thereby override the header values they would send
  otherwise.
The <meta> element is
  an empty element.
This section is modeled after the <meta>
  description in the HTML 4.01 Specification [HTML]. Despite the fact that the name/content
  model is now being replaced by better ways to include metadata,
  see for instance 
  Section 20.6 of XHTML 2.0 [XHTML2],
  and the fact that the http-equiv directive is no
  longer recommended in 
  Section 3.3 of XHTML Media Types [XHTML-MTYPES], the Working Group has
  decided to retain this for compatibility with the other
  specifications of the first version of the Speech Interface
  Framework (VoiceXML, SSML, SRGS, CCXML). Future versions of the
  framework will align with more modern metadata schemes.
This is an example of how <meta> elements can be included in
  a PLS document to specify a resource that provides additional
  metadata information.
<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0" 
      xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
      xsi:schemaLocation="http://www.w3.org/2005/01/pronunciation-lexicon 
        http://www.w3.org/TR/2007/CR-pronunciation-lexicon-20071212/pls.xsd"
      alphabet="ipa" xml:lang="en-US">
    <meta http-equiv="Cache-Control" content="no-cache"/>
    <meta name="seeAlso" content="http://example.com/my-pls-metadata.xml"/>
    <!--  If lexemes are to be added to this lexicon, they start below -->
</lexicon>
<metadata> Element
The <metadata> element
  is a container in which information about the document can be
  placed using metadata markup. The behavior of software processing
  the content of a <metadata> element is not
  described in this specification. Therefore, software implementing
  this specification is free to ignore that content.
Although any metadata markup can be used within <metadata>, it is RECOMMENDED that the RDF/XML Syntax [RDF-XMLSYNTAX] be used, in conjunction with
  the general metadata properties defined by the Dublin Core
  Metadata Initiative [DC] (e.g., Title,
  Creator, Subject, Description, Rights, etc.).
This is an example of how metadata can be included in a PLS document using the "Dublin Core Metadata Element Set, Version 1.1" [DC-ES] describing general document information such as title, description, date, and so on:
<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0" 
      xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
      xsi:schemaLocation="http://www.w3.org/2005/01/pronunciation-lexicon 
        http://www.w3.org/TR/2007/CR-pronunciation-lexicon-20071212/pls.xsd"
      alphabet="ipa" xml:lang="en-US">
  <metadata>
    <rdf:RDF
       xmlns:rdf = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
       xmlns:dc  = "http://purl.org/dc/elements/1.1/">
     <!-- Metadata about the PLS document -->
     <rdf:Description rdf:about=""
       dc:title="Pronunciation lexicon for W3C terms"
       dc:description="Common pronunciations for many W3C acronyms and abbreviations, e.g. I18N or WAI"
       dc:publisher="W3C"
       dc:date="2005-11-29"
       dc:rights="Copyright 2002 W3C"
       dc:format="application/pls+xml">
       <dc:creator>The W3C Voice Browser Working Group</dc:creator>
     </rdf:Description>
    </rdf:RDF>
  </metadata>
  <!--  If lexemes are to be added to this lexicon, they start below -->
</lexicon>
  <lexeme> Element
The <lexeme> element is
  a container for a lexical entry which MAY include
  multiple orthographies and
  multiple pronunciations.
The <lexeme> element
  contains one or more <grapheme> elements, one or more
  pronunciations (either by <phoneme> or <alias> elements or a combination
  of both), and zero or more <example> elements. The children
  of the <lexeme> element
  MAY
  appear in any order, but note that the order will have an impact
  on the treatment of multiple pronunciations (see Section 4.9).
The <lexeme> element
  has an OPTIONAL xml:id [XML-ID] attribute, allowing the element to be
  referenced from other documents (through fragment identifiers or
  XPointer [XPOINTER], for instance).
  For example, developers may use external RDF statements [RDF-CONC] to associate metadata (such as part
  of speech or word relationships) with a lexeme.
The <lexeme> element
  has an OPTIONAL role attribute which takes
  as its value one or more white space separated QNames as defined
  in Section 4 of Namespaces in XML (1.0 [XML-NS10] or 1.1 [XML-NS11], depending on the version of XML
  being used).
The role attribute describes additional
  information to help the selection of the most relevant
  pronunciation for a given orthography. The main use is to
  differentiate words that have the same spelling but are
  pronounced in different ways (cf. homographs and see also Section 5.5). A QName in the attribute content of the
  role attribute is expanded into an expanded-name
  using the namespace declarations in scope for the containing
  <lexeme> element. Thus,
  each QName provides a reference to a specific item in the
  designated namespace. In the second example below, the QName
  "claws:VVI" within the role attribute expands to the
  "VVI" item in the
  "http://www.example.com/claws7tags" namespace. This
  mechanism allows for referencing defined taxonomies of word
  classes, with the expectation that they are documented at the
  specified namespace URI.
A pronunciation lexicon for the Italian language with two lexemes. One of them is for the loan word "file" which is often used in technical discussions to have the same meaning and pronunciation as in English. This is distinct from the homograph noun "file" which is the plural form of "fila" meaning "queue". Note that this user-specified pronunciation for "file" takes precedence over any system-defined pronunciation.
<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0" 
      xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
      xsi:schemaLocation="http://www.w3.org/2005/01/pronunciation-lexicon 
        http://www.w3.org/TR/2007/CR-pronunciation-lexicon-20071212/pls.xsd"
      alphabet="ipa" xml:lang="it">
  <lexeme>
    <grapheme>file</grapheme>
    <phoneme>faɪl</phoneme>
    <!-- This is the pronunciation
      of the loan word "file" in Italian.
      IPA string is: "faɪl" -->
  </lexeme>
  <lexeme>
    <grapheme>EU</grapheme>
    <alias>Unione Europea
      <!-- This is a substitution of the European
      Union acronym in Italian language.  --></alias>
  </lexeme>
</lexicon>
  The following is an example of a pronunciation lexicon for the word "read":
<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0" 
      xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
      xsi:schemaLocation="http://www.w3.org/2005/01/pronunciation-lexicon 
        http://www.w3.org/TR/2007/CR-pronunciation-lexicon-20071212/pls.xsd"
      xmlns:claws="http://www.example.com/claws7tags" alphabet="ipa"
      xml:lang="en">
  <lexeme role="claws:VVI claws:VV0 claws:NN1">
    <!-- verb infinitive, verb present tense, singular noun -->
    <grapheme>read</grapheme>
    <phoneme>riːd</phoneme>
    <!-- IPA string is: "riːd" -->
  </lexeme>
  <lexeme role="claws:VVN claws:VVD">
    <!-- verb past participle, verb past tense -->
    <grapheme>read</grapheme>
    <phoneme>red</phoneme>
  </lexeme>
</lexicon>
  Note that the role attribute is based on
  qualified values (in this example from the UCREL CLAWS7
  tagset of part-of-speech) to distinguish the verb infinitive,
  present tense and singular noun from the verb past tense and past
  participle pronunciation of the word "read".
The following is an example document which references the
  above lexicon and includes an extension element to show how the
  role attribute may be used to select the relevant
  pronunciation of the word "read" in the dialog.
<?xml version="1.0" encoding="UTF-8"?>
<speak version="1.0" 
      xmlns="http://www.w3.org/2001/10/synthesis" 
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://www.w3.org/2001/10/synthesis
        http://www.w3.org/TR/speech-synthesis/synthesis.xsd"
      xmlns:myssml="http://www.example.com/ssml_extensions"
      xmlns:claws="http://www.example.com/claws7tags"
      xml:lang="en">
  <lexicon uri="http://www.example.com/lexicon.pls"
      type="application/pls+xml"/>
  <voice gender="female" age="3">
      Can you <myssml:token role="claws:VVI">read</myssml:token> this book
      to me?
  </voice>
  <voice gender="male" age="43">
      I've already <myssml:token role="claws:VVN">read</myssml:token> it
      three times!
  </voice>
</speak>
  Here is another example in Chinese that uses SSML 1.1 [SSML-11].
<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0"
      xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://www.w3.org/2005/01/pronunciation-lexicon 
        http://www.w3.org/TR/2007/CR-pronunciation-lexicon-20071212/pls.xsd"
      xmlns:claws="http://www.example.com/claws7tags"
      alphabet="x-myorganization-pinyin"
      xml:lang="zh-CN">
  <lexeme role="claws:VV0">
    <!-- base form of lexical verb -->
    <grapheme>处</grapheme>
    <phoneme>chu3</phoneme>
    <!-- pinyin string is: "chǔ" in 处罚 处置 -->
  </lexeme>
  <lexeme role="claws:NN">
    <!-- common noun, neutral for number -->
    <grapheme>处</grapheme>
    <phoneme>chu4</phoneme>
    <!-- pinyin string is: "chù" in 处所 妙处 -->
  </lexeme>
</lexicon>
  This is a sample document which references the above lexicon
  and shows how the role attribute may be used to
  select the relevant pronunciation of the Chinese word "处" in the
  dialog.
<?xml version="1.0" encoding="UTF-8"?>
<speak version="1.1"
      xmlns="http://www.w3.org/2001/10/synthesis"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://www.w3.org/2001/10/synthesis
        http://www.w3.org/TR/speech-synthesis/synthesis.xsd"
      xmlns:claws="http://www.example.com/claws7tags"
      xml:lang="zh-CN">
  <lexicon uri="http://www.example.com/lexicon.pls"
      type="application/pls+xml"
      xml:id="mylex"/>
  <lookup ref="mylex">
    他这个人很不好相<w role="claws:VV0">处</w>。
    此<w role="claws:NN">处</w>不准照相。
  </lookup>
</speak>  
  The SRGS 1.0 [SRGS] and SSML 1.0
  [SSML] specifications do not currently
  support a selection mechanism based on the role
  attribute. Future versions of these specifications are expected
  to allow the selection of relevant pronunciations on the basis of
  the role attribute.
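The expansion of role QNames described above can be sketched in a few lines of Python. This is only an illustration and not part of this specification: the function name expanded_roles is hypothetical, only prefixed QNames are handled, and the sketch relies on the "start-ns" and "end-ns" events of the standard library's xml.etree.ElementTree.iterparse to track the namespace declarations in scope.

```python
import io
import xml.etree.ElementTree as ET

PLS = "{http://www.w3.org/2005/01/pronunciation-lexicon}"

# Sample lexicon modelled on the "read" example; &#x2D0; is the IPA length mark.
SAMPLE = """<lexicon version="1.0"
      xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
      xmlns:claws="http://www.example.com/claws7tags"
      alphabet="ipa" xml:lang="en">
  <lexeme role="claws:VVI claws:VV0 claws:NN1">
    <grapheme>read</grapheme>
    <phoneme>ri&#x2D0;d</phoneme>
  </lexeme>
  <lexeme role="claws:VVN claws:VVD">
    <grapheme>read</grapheme>
    <phoneme>red</phoneme>
  </lexeme>
</lexicon>"""

def expanded_roles(pls_xml):
    """Return one (graphemes, expanded role names) pair per <lexeme>.

    Each expanded role name is a (namespace URI, local name) tuple.
    Hypothetical helper: an undeclared prefix raises KeyError.
    """
    in_scope = []   # stack of (prefix, uri) declarations currently in scope
    roles = []
    result = []
    events = ("start-ns", "end-ns", "start", "end")
    source = io.BytesIO(pls_xml.encode("utf-8"))
    for event, item in ET.iterparse(source, events):
        if event == "start-ns":
            in_scope.append(item)          # item is a (prefix, uri) tuple
        elif event == "end-ns":
            in_scope.pop()                 # innermost declaration ends
        elif event == "start" and item.tag == PLS + "lexeme":
            bindings = dict(in_scope)      # inner declarations override outer ones
            roles = []
            for qname in item.get("role", "").split():
                prefix, _, local = qname.rpartition(":")
                roles.append((bindings[prefix], local))
        elif event == "end" and item.tag == PLS + "lexeme":
            graphemes = [g.text for g in item.findall(PLS + "grapheme")]
            result.append((graphemes, roles))
    return result

for graphemes, roles in expanded_roles(SAMPLE):
    print(graphemes, roles)
```

A consumer that selects pronunciations by role would compare these expanded names, rather than the raw prefixed strings, because the same prefix may be bound to different namespaces in different documents.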
  <grapheme> Element
A <lexeme> contains at
  least one <grapheme>
  element. The <grapheme>
  element contains text describing the orthography of the <lexeme>.
The <grapheme> element
  MUST
  contain 'character' child information items. The <grapheme> element MUST NOT
  contain 'element' child information items from any namespace,
  i.e. PLS or foreign namespace.
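In more complex situations there may be alternative textual representations for the same word or phrase. This can arise for a number of reasons, for example regional spelling variants (such as "theater" and "theatre") or multiple writing systems for the same language (such as the Romaji, Kanji and Hiragana orthographies of a Japanese word).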
In order to remove the need for duplication of pronunciation
  information to cope with the above variations, the <lexeme> element MAY contain more
  than one <grapheme>
  element to define the base orthography and any variants. Note that
  all the pronunciations given within the <lexeme> apply to each and every
  <grapheme> within the
  <lexeme>.
An example of a single grapheme and a single pronunciation.
<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0" 
      xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
      xsi:schemaLocation="http://www.w3.org/2005/01/pronunciation-lexicon 
        http://www.w3.org/TR/2007/CR-pronunciation-lexicon-20071212/pls.xsd"
      alphabet="ipa" xml:lang="en-US">
  <lexeme>
    <grapheme>Sepulveda</grapheme>
    <phoneme>səˈpʌlvɪdə</phoneme>
    <!-- IPA string is: "səˈpʌlvɪdə" -->
  </lexeme>
</lexicon>
  Another example with more than one written form for a lexical entry, where the first orthography uses Latin characters for "Romaji" orthography, the second one uses "Kanji" orthography and the third one uses the "Hiragana" orthography:
<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0" 
      xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
      xsi:schemaLocation="http://www.w3.org/2005/01/pronunciation-lexicon 
        http://www.w3.org/TR/2007/CR-pronunciation-lexicon-20071212/pls.xsd"
      alphabet="ipa" xml:lang="ja">
  <lexeme>
    <grapheme>nihongo<!-- "Romaji" --></grapheme>
    <grapheme>日本語<!-- "Kanji" --></grapheme>
    <grapheme>にほんご<!-- "Hiragana" --></grapheme>
    <phoneme>ɲihoŋo
      <!-- IPA string is: "ɲihoŋo" --></phoneme>
  </lexeme>
</lexicon>
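The rule that every pronunciation in a <lexeme> applies to each of its <grapheme> elements can be illustrated with a small Python sketch. It is not part of this specification: the function name grapheme_index is hypothetical, only <phoneme> pronunciations are indexed, and the IPA strings in the sample are written as XML character references.

```python
import xml.etree.ElementTree as ET

PLS = "{http://www.w3.org/2005/01/pronunciation-lexicon}"

# Sample modelled on the "theater"/"theatre" entry of this specification.
SAMPLE = """<lexicon version="1.0"
      xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
      alphabet="ipa" xml:lang="en-US">
  <lexeme>
    <grapheme>theater</grapheme>
    <grapheme>theatre</grapheme>
    <phoneme prefer="true">&#x2C8;&#x3B8;&#x26A;&#x259;t&#x259;r</phoneme>
    <phoneme>&#x2C8;&#x3B8;i&#x2D0;j&#x259;t&#x259;r</phoneme>
  </lexeme>
</lexicon>"""

def grapheme_index(pls_xml):
    """Map each grapheme to the <phoneme> strings of the lexemes containing it."""
    index = {}
    for lexeme in ET.fromstring(pls_xml).iter(PLS + "lexeme"):
        # Pronunciations are kept in document order.
        phonemes = [p.text.strip() for p in lexeme.findall(PLS + "phoneme")]
        for grapheme in lexeme.findall(PLS + "grapheme"):
            # Every grapheme of the lexeme receives every pronunciation.
            index.setdefault(grapheme.text.strip(), []).extend(phonemes)
    return index

index = grapheme_index(SAMPLE)
print(index["theater"] == index["theatre"])
```

Both spellings end up with the same two pronunciations, so the pronunciation information is written once in the lexicon and shared by all orthographic variants.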
  <phoneme> Element
A <lexeme> MAY contain one or
  more <phoneme> elements.
  The <phoneme> element
  contains text describing how the <lexeme> is pronounced.
The <phoneme> element
  MUST
  contain 'character' child information items. The <phoneme> element MUST NOT
  contain 'element' child information items from any namespace,
  i.e. PLS or foreign namespace.
A <phoneme> element
  MAY have
  an alphabet attribute, which indicates the
  pronunciation alphabet that is used for this <phoneme> element only. See
  Section 4.1 for the default pronunciation
  alphabet. The legal values for the alphabet
  attribute are described in Section 2.
The prefer attribute is OPTIONAL. When it is set to
  "true", it indicates the pronunciation that MUST be used by a
  speech synthesis engine. See Section 4.9 for the
  required behavior when multiple pronunciations have
  prefer set to "true". The possible
  values are "true" and "false". The
  default value is "false".
The prefer mechanism spans both the <phoneme> and <alias> elements. Section 4.9 describes how multiple pronunciations
  specified in PLS are treated by ASR and TTS processors, and Section 4.9.3 gives many examples.
More than one pronunciation per lexical entry:
<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0" 
      xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
      xsi:schemaLocation="http://www.w3.org/2005/01/pronunciation-lexicon 
        http://www.w3.org/TR/2007/CR-pronunciation-lexicon-20071212/pls.xsd"
      alphabet="ipa" xml:lang="en-US">
  <lexeme>
    <grapheme>huge</grapheme>
    <phoneme prefer="true">hjuːdʒ</phoneme>
    <!-- IPA string is: "hjuːdʒ" -->
    <phoneme>juːdʒ</phoneme>
    <!-- IPA string is: "juːdʒ" -->
  </lexeme>
</lexicon>
  More than one written form and more than one pronunciation:
<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0" 
      xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
      xsi:schemaLocation="http://www.w3.org/2005/01/pronunciation-lexicon 
        http://www.w3.org/TR/2007/CR-pronunciation-lexicon-20071212/pls.xsd"
      alphabet="ipa" xml:lang="en-US">
  <lexeme>
    <grapheme>theater</grapheme>
    <grapheme>theatre</grapheme>
    <phoneme prefer="true">ˈθɪətər</phoneme>
    <!-- IPA string is: "ˈθɪətər" -->
    <phoneme>ˈθiːjətər</phoneme>
    <!-- IPA string is: "ˈθiːjətər" -->
  </lexeme>
</lexicon>
  An example of a <phoneme> that changes the
  pronunciation alphabet to a proprietary one.
<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0" 
      xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
      xsi:schemaLocation="http://www.w3.org/2005/01/pronunciation-lexicon 
        http://www.w3.org/TR/2007/CR-pronunciation-lexicon-20071212/pls.xsd"
      alphabet="ipa" xml:lang="en-US">
  <lexeme>
    <grapheme>color</grapheme>
    <phoneme>ˈkʌlər</phoneme>
    <!-- IPA string is: "ˈkʌlər" -->
  </lexeme>
  <lexeme>
    <grapheme>XYZ</grapheme>
    <phoneme alphabet="x-example-alphabet">XYZ</phoneme>
    <!-- The above pronunciation is given in a proprietary alphabet 
      called: "x-example-alphabet" -->
  </lexeme>
</lexicon>
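The way a per-element alphabet attribute overrides the default alphabet of the lexicon can be sketched as follows. This is an illustrative Python fragment rather than normative text: the function name phoneme_alphabets is hypothetical, and the sample uses ASCII placeholder pronunciations.

```python
import xml.etree.ElementTree as ET

PLS = "{http://www.w3.org/2005/01/pronunciation-lexicon}"

SAMPLE = """<lexicon version="1.0"
      xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
      alphabet="ipa" xml:lang="en-US">
  <lexeme>
    <grapheme>red</grapheme>
    <phoneme>red</phoneme>
  </lexeme>
  <lexeme>
    <grapheme>XYZ</grapheme>
    <phoneme alphabet="x-example-alphabet">XYZ</phoneme>
  </lexeme>
</lexicon>"""

def phoneme_alphabets(pls_xml):
    """Return (pronunciation, effective alphabet) for every <phoneme>."""
    root = ET.fromstring(pls_xml)
    default = root.get("alphabet")   # default alphabet set on <lexicon>
    return [
        # A local alphabet attribute applies to this <phoneme> only.
        (p.text.strip(), p.get("alphabet", default))
        for p in root.iter(PLS + "phoneme")
    ]

print(phoneme_alphabets(SAMPLE))
```

Only the second <phoneme> is interpreted in the proprietary alphabet; the first one falls back to the alphabet declared on the <lexicon> element.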
  <alias> Element
A <lexeme> element
  MAY
  contain one or more <alias> elements which are used to
  indicate the pronunciation of an acronym or an abbreviated term, in terms of
  other orthographies, or other
  substitutions as necessary; see examples below and in Section 4.9.3.
The <alias> element
  MUST
  contain 'character' child information items. The <alias> element MUST NOT
  contain 'element' child information items from any namespace,
  i.e. PLS or foreign namespace.
In a <lexeme> element,
  both <alias> elements and
  <phoneme> elements
  MAY be
  present. If authors want explicit control over the pronunciation,
  they can use the <phoneme>
  element instead of the <alias> element.
The <alias> element has
  an OPTIONAL prefer attribute analogous
  to the prefer attribute for the <phoneme> element; see Section 4.6 for a normative description of the
  prefer attribute.
Pronunciations of <alias> element contents
  MUST be
  generated by the processor using pronunciations described by the
  <phoneme> element of any
  constituent graphemes in the PLS document and without invoking
  recursive access to the PLS document on the <alias> elements of any
  constituent graphemes. The processor SHOULD
  determine the pronunciations of the remaining <alias> element contents by the
  same process by which it determines the pronunciation of
  out-of-lexicon graphemes.
Acronym expansion using the
  <alias> element:
<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0" 
      xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
      xsi:schemaLocation="http://www.w3.org/2005/01/pronunciation-lexicon 
        http://www.w3.org/TR/2007/CR-pronunciation-lexicon-20071212/pls.xsd"
      alphabet="ipa" xml:lang="en-US">
  <lexeme>
    <grapheme>W3C</grapheme>
    <alias>World Wide Web Consortium</alias>
  </lexeme>
</lexicon>
  The following example illustrates a combination of <alias> and <phoneme> elements. The indicated
  acronym, "GNU", has only two pronunciations. Note that the
  pronunciation described by the <alias> element of "Unix" is not
  used as part of the pronunciation of the <alias> element contents of "GNU"
  as recursion of <alias> is
  not permissible. The pronunciations described by the <phoneme> elements of "GNU" and
  "Unix" are used by the processor to generate the pronunciation of
  "GNU is Not Unix".
<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0" 
      xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
      xsi:schemaLocation="http://www.w3.org/2005/01/pronunciation-lexicon 
        http://www.w3.org/TR/2007/CR-pronunciation-lexicon-20071212/pls.xsd"
      alphabet="ipa" xml:lang="en-US">
  <lexeme>
    <grapheme>GNU</grapheme>
    <alias><!-- be careful about recursion here -->GNU is Not Unix</alias>
    <phoneme>gəˈnuː</phoneme>
    <!-- IPA string is: "gəˈnuː" -->
  </lexeme>
  <lexeme>
    <grapheme>Unix</grapheme>
    <grapheme>UNIX</grapheme>
    <alias>a multiplexed information and computing service</alias>
    <phoneme>ˈjuːnɪks</phoneme>
    <!-- IPA string is: "ˈjuːnɪks" -->
  </lexeme>
</lexicon>
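The non-recursive treatment of <alias> contents can be sketched in Python as below. Several points are assumptions made for illustration only: the names phoneme_table, expand_alias and spell_out are hypothetical, alias text is split on white space, graphemes are matched exactly, the first <phoneme> of a matching lexeme is taken, and spell_out merely stands in for whatever process a real processor applies to out-of-lexicon graphemes.

```python
import xml.etree.ElementTree as ET

PLS = "{http://www.w3.org/2005/01/pronunciation-lexicon}"

# Sample modelled on the "GNU" example; IPA is given as character references.
SAMPLE = """<lexicon version="1.0"
      xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
      alphabet="ipa" xml:lang="en-US">
  <lexeme>
    <grapheme>GNU</grapheme>
    <alias>GNU is Not Unix</alias>
    <phoneme>g&#x259;&#x2C8;nu&#x2D0;</phoneme>
  </lexeme>
  <lexeme>
    <grapheme>Unix</grapheme>
    <grapheme>UNIX</grapheme>
    <alias>a multiplexed information and computing service</alias>
    <phoneme>&#x2C8;ju&#x2D0;n&#x26A;ks</phoneme>
  </lexeme>
</lexicon>"""

def phoneme_table(root):
    """Map graphemes to <phoneme> strings only; <alias> elements are ignored."""
    table = {}
    for lexeme in root.iter(PLS + "lexeme"):
        phonemes = [p.text.strip() for p in lexeme.findall(PLS + "phoneme")]
        for grapheme in lexeme.findall(PLS + "grapheme"):
            table.setdefault(grapheme.text.strip(), []).extend(phonemes)
    return table

def expand_alias(alias_text, table, fallback):
    """Pronounce alias text token by token, never following other aliases."""
    pronunciation = []
    for token in alias_text.split():
        phonemes = table.get(token)
        # Tokens without a <phoneme> entry go to the out-of-lexicon process.
        pronunciation.append(phonemes[0] if phonemes else fallback(token))
    return pronunciation

def spell_out(token):
    """Placeholder for a processor's own out-of-lexicon pronunciation."""
    return "<" + token + ">"

table = phoneme_table(ET.fromstring(SAMPLE))
print(expand_alias("GNU is Not Unix", table, spell_out))
```

Because the lookup table is built from <phoneme> elements alone, the alias of "Unix" can never be substituted while the alias of "GNU" is being expanded, which is the behavior the example above describes.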
  <example> Element
The <example> element
  includes an example sentence that illustrates an occurrence of
  this lexeme. Because the examples are
  explicitly marked, automated tools can be used for regression
  testing and for generation of pronunciation lexicon documentation.
The <example> element
  MUST
  contain 'character' child information items. The <example> element MUST NOT
  contain 'element' child information items from any namespace,
  i.e. PLS or foreign namespace.
Zero, one or many <example> elements MAY be provided
  for a single <lexeme>
  element.
<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0" 
      xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
      xsi:schemaLocation="http://www.w3.org/2005/01/pronunciation-lexicon 
        http://www.w3.org/TR/2007/CR-pronunciation-lexicon-20071212/pls.xsd"
      alphabet="ipa" xml:lang="en-US">
  <lexeme>
    <grapheme>lead</grapheme>
    <phoneme>led</phoneme>
    <example>My feet were as heavy as lead.<!-- possible comment --></example>
  </lexeme>
  <lexeme>
    <grapheme>lead</grapheme>
    <phoneme>liːd</phoneme>
    <!-- IPA string is: "liːd" -->
    <example>The guide once again took the lead.</example>
  </lexeme>
</lexicon>
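As an illustration of how explicitly marked examples lend themselves to tooling, the following Python sketch collects every example sentence together with the pronunciations of its lexeme. It is a minimal sketch under stated assumptions: the function name example_cases is hypothetical, and what a regression test would do with each pair is left to the tool.

```python
import xml.etree.ElementTree as ET

PLS = "{http://www.w3.org/2005/01/pronunciation-lexicon}"

SAMPLE = """<lexicon version="1.0"
      xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
      alphabet="ipa" xml:lang="en-US">
  <lexeme>
    <grapheme>lead</grapheme>
    <phoneme>led</phoneme>
    <example>My feet were as heavy as lead.</example>
  </lexeme>
  <lexeme>
    <grapheme>lead</grapheme>
    <phoneme>li&#x2D0;d</phoneme>
    <example>The guide once again took the lead.</example>
  </lexeme>
</lexicon>"""

def example_cases(pls_xml):
    """Return (example sentence, phoneme list) pairs in document order."""
    cases = []
    for lexeme in ET.fromstring(pls_xml).iter(PLS + "lexeme"):
        phonemes = [p.text.strip() for p in lexeme.findall(PLS + "phoneme")]
        # A lexeme may carry zero, one or many <example> elements.
        for example in lexeme.findall(PLS + "example"):
            cases.append((example.text.strip(), phonemes))
    return cases

for sentence, phonemes in example_cases(SAMPLE):
    print(sentence, phonemes)
```

A documentation generator could print these pairs directly, while a regression test could synthesize each sentence and check that the expected pronunciation is produced.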
  This section describes the treatment of multiple pronunciations specified in a PLS document for ASR and TTS.
If more than one pronunciation for a given <lexeme> is specified (either by
  <phoneme> elements or
  <alias> elements or a
  combination of both), an ASR processor
  MUST
  consider each of them as valid pronunciations for the <grapheme>. See Example 2 and following examples in Section 4.9.3.
If more than one <lexeme> contains the same
  <grapheme>, all relevant
  pronunciations (see discussion in Section 4.4
  regarding the selection of relevant pronunciations using the
  role attribute) will be collected in document order
  and an ASR processor MUST consider all
  of them as valid pronunciations for the <grapheme>. See Example 7 and Example 8
  in Section 4.9.3.
If more than one pronunciation for a given <lexeme> is specified (either by
  <phoneme> elements or
  <alias> elements or a
  combination of both), a TTS processor
  MUST
  use the first one in document order that has the
  prefer attribute set to "true". If none
  of the pronunciations has prefer set to
  "true", the TTS processor
  MUST
  use the first one in document order unless the TTS processor is documented as having a method of
  selecting pronunciations, in which case the processor MUST use any one
  of the pronunciations. See Example 2 and
  following examples in Section 4.9.3.
If more than one <lexeme> contains the same
  <grapheme>, all relevant
  pronunciations (see discussion in Section 4.4
  regarding the selection of relevant pronunciations using the
  role attribute) will be collected in document order
  and a TTS processor MUST use the
  first one in document order that has the prefer
  attribute set to "true". If none of the relevant
  pronunciations has prefer set to
  "true", the TTS processor
  MUST
  use the first one in document order unless the TTS processor is documented as having a method of
  selecting pronunciations, in which case the processor MUST use any one
  of the relevant pronunciations. See Example
  7 and Example 8 in Section 4.9.3.
Note that a TTS processor may have language-dependent internal mechanisms enabling it to automatically choose between multiple pronunciations. See Example 9 in Section 4.9.3.
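The ASR and TTS rules above can be summarized in a short Python sketch. It is illustrative only: the function names are hypothetical, role-based filtering of relevant pronunciations is left out, and the TTS function implements just the default behavior (first pronunciation with prefer set to "true", otherwise first in document order), not any processor-specific selection method.

```python
import xml.etree.ElementTree as ET

PLS = "{http://www.w3.org/2005/01/pronunciation-lexicon}"

HEADER = ('<lexicon version="1.0" alphabet="ipa" xml:lang="en-US" '
          'xmlns="http://www.w3.org/2005/01/pronunciation-lexicon">')

# One lexeme, second pronunciation preferred (&#x2D0; is the IPA length mark).
PREFERRED = HEADER + """
  <lexeme>
    <grapheme>lead</grapheme>
    <phoneme>led</phoneme>
    <phoneme prefer="true">li&#x2D0;d</phoneme>
  </lexeme>
</lexicon>"""

# Two lexemes with the same grapheme and no prefer attribute.
UNMARKED = HEADER + """
  <lexeme>
    <grapheme>lead</grapheme>
    <phoneme>led</phoneme>
  </lexeme>
  <lexeme>
    <grapheme>lead</grapheme>
    <phoneme>li&#x2D0;d</phoneme>
  </lexeme>
</lexicon>"""

def pronunciations(root, grapheme):
    """Collect (kind, text, prefer) in document order across all lexemes."""
    found = []
    for lexeme in root.iter(PLS + "lexeme"):
        graphemes = [g.text.strip() for g in lexeme.findall(PLS + "grapheme")]
        if grapheme not in graphemes:
            continue
        for child in lexeme:
            if child.tag in (PLS + "phoneme", PLS + "alias"):
                kind = child.tag[len(PLS):]            # "phoneme" or "alias"
                prefer = child.get("prefer", "false") == "true"
                found.append((kind, child.text.strip(), prefer))
    return found

def asr_choices(root, grapheme):
    """ASR: every collected pronunciation is valid."""
    return [(kind, text) for kind, text, _ in pronunciations(root, grapheme)]

def tts_choice(root, grapheme):
    """TTS default: first preferred pronunciation, else first in document order."""
    found = pronunciations(root, grapheme)
    for kind, text, prefer in found:
        if prefer:
            return (kind, text)
    return (found[0][0], found[0][1]) if found else None

print(tts_choice(ET.fromstring(PREFERRED), "lead"))
print(tts_choice(ET.fromstring(UNMARKED), "lead"))
```

An <alias> chosen by tts_choice would still need to be turned into a pronunciation as described for the <alias> element; that step is outside this sketch.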
This section is informative.
The following examples are designed to describe and illustrate the most common cases of multiple pronunciations. ASR and TTS behavior is described for each.
In the following example, there is only one pronunciation. It will be used by both ASR and TTS processors.
<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0" 
      xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
      xsi:schemaLocation="http://www.w3.org/2005/01/pronunciation-lexicon 
        http://www.w3.org/TR/2007/CR-pronunciation-lexicon-20071212/pls.xsd"
      alphabet="ipa" xml:lang="en-US">
  <lexeme>
    <grapheme>bead</grapheme>
    <phoneme>biːd</phoneme>
    <!-- IPA string is: "biːd" -->
  </lexeme>
</lexicon>
  In the following example, there are two pronunciations. An
  ASR processor will recognize both
  pronunciations, whereas a TTS processor
  will only use one. Since none of the pronunciations has
  prefer set to "true", unless the
  processor is documented to have a different strategy, it will use
  the first of the pronunciations because it is first in document
  order.
<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0" 
      xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
      xsi:schemaLocation="http://www.w3.org/2005/01/pronunciation-lexicon 
        http://www.w3.org/TR/2007/CR-pronunciation-lexicon-20071212/pls.xsd"
      alphabet="ipa" xml:lang="en-US">
  <lexeme>
    <grapheme>read</grapheme>
    <phoneme>red</phoneme>
    <phoneme>riːd</phoneme>
    <!-- IPA string is: "riːd" -->
  </lexeme>
</lexicon>
  In the following example, there are two pronunciations. An
  ASR processor will recognize both
  pronunciations, whereas a TTS processor
  will only use the second one (because it has prefer
  set to "true").
<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0" 
      xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
      xsi:schemaLocation="http://www.w3.org/2005/01/pronunciation-lexicon 
        http://www.w3.org/TR/2007/CR-pronunciation-lexicon-20071212/pls.xsd"
      alphabet="ipa" xml:lang="en-US">
  <lexeme>
    <grapheme>lead</grapheme>
    <phoneme>led</phoneme>
    <phoneme prefer="true">liːd</phoneme>
    <!-- IPA string is: "liːd" -->
  </lexeme>
</lexicon>
  In the following example, "read" has two pronunciations. The
  first one is specified by means of an alias to "red", which is
  defined just below it. An ASR processor
  will recognize both pronunciations, whereas a TTS processor will only use one. Since none of
  the pronunciations has prefer set to
  "true", unless the processor is documented to have a
  different strategy, it will use the first of the pronunciations
  because it is first in document order. In this example, the alias
  refers to a lexeme later in the lexicon, but in general, this
  order is not relevant.
<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0" 
      xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
      xsi:schemaLocation="http://www.w3.org/2005/01/pronunciation-lexicon 
        http://www.w3.org/TR/2007/CR-pronunciation-lexicon-20071212/pls.xsd"
      alphabet="ipa" xml:lang="en-US">
  <lexeme>
    <grapheme>read</grapheme>
    <alias>red</alias>
    <phoneme>riːd</phoneme>
    <!-- IPA string is: "riːd" -->
  </lexeme>
  <lexeme>
    <grapheme>red</grapheme>
    <phoneme>red</phoneme>
  </lexeme>
</lexicon>
  In the following example, there are two pronunciations for
  "lead". Both are given with prefer set to
  "true". An ASR processor
  will recognize both pronunciations, whereas a TTS processor will only use the first one
  (because it is first in document order that has
  prefer set to "true").
<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0" 
      xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
      xsi:schemaLocation="http://www.w3.org/2005/01/pronunciation-lexicon 
        http://www.w3.org/TR/2007/CR-pronunciation-lexicon-20071212/pls.xsd"
      alphabet="ipa" xml:lang="en-US">
  <lexeme>
    <grapheme>lead</grapheme>
    <alias prefer="true">led</alias>
    <phoneme prefer="true">liːd</phoneme>
    <!-- IPA string is: "liːd" -->
  </lexeme>
  <lexeme>
    <grapheme>led</grapheme>
    <phoneme>led</phoneme>
  </lexeme>
</lexicon>
  In the following example, there are two pronunciations for
  "lead". An ASR processor will recognize both
  pronunciations, whereas a TTS processor
  will only use the second one (because it has prefer
  set to "true"). Note that the alias entry for "lead"
  as "led" does not inherit the preference of the pronunciation of
  the alias.
<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0" 
      xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
      xsi:schemaLocation="http://www.w3.org/2005/01/pronunciation-lexicon 
        http://www.w3.org/TR/2007/CR-pronunciation-lexicon-20071212/pls.xsd"
      alphabet="ipa" xml:lang="en-US">
  <lexeme>
    <grapheme>lead</grapheme>
    <alias>led</alias>
    <phoneme prefer="true">liːd</phoneme>
    <!-- IPA string is: "liːd" -->
  </lexeme>
  <lexeme>
    <grapheme>led</grapheme>
    <phoneme prefer="true">led</phoneme>
  </lexeme>
</lexicon>
  In the following example, "lead" has two different entries in
  the lexicon. An ASR processor will
  recognize both pronunciations given here, but a TTS processor will only use one. Since none
  of the pronunciations has prefer set to
  "true", unless the processor is documented to have a
  different strategy, it will use the "led" pronunciation because
  it is first in document order.
<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0" 
      xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
      xsi:schemaLocation="http://www.w3.org/2005/01/pronunciation-lexicon 
        http://www.w3.org/TR/2007/CR-pronunciation-lexicon-20071212/pls.xsd"
      alphabet="ipa" xml:lang="en-US">
  <lexeme>
    <grapheme>lead</grapheme>
    <phoneme>led</phoneme>
  </lexeme>
  <lexeme>
    <grapheme>lead</grapheme>
    <phoneme>liːd</phoneme>
    <!-- IPA string is: "liːd" -->
  </lexeme>
</lexicon>
  In the following example, there are two pronunciations in each
  of two different lexeme entries in the same lexicon document. An
  ASR processor will recognize both
  pronunciations given here, but a TTS
  processor will only use the "liːd" pronunciation, because
  it is the first one in document order that has
  prefer set to "true".
<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0" 
      xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
      xsi:schemaLocation="http://www.w3.org/2005/01/pronunciation-lexicon 
        http://www.w3.org/TR/2007/CR-pronunciation-lexicon-20071212/pls.xsd"
      alphabet="ipa" xml:lang="en-US">
  <lexeme>
    <grapheme>lead</grapheme>
    <alias>led</alias>
    <phoneme prefer="true">liːd</phoneme>
    <!-- IPA string is: "liːd" -->
  </lexeme>
  <lexeme>
    <grapheme>lead</grapheme>
    <phoneme prefer="true">led</phoneme>
    <phoneme>liːd</phoneme>
    <!-- IPA string is: "liːd" -->
  </lexeme>
</lexicon>
  In the following example in French, "1" has three
  pronunciations. The latter two pronunciations are specified by
  means of an alias to "une", which is defined just below it. An
  ASR processor will recognize all three
  pronunciations given here, but a TTS
  processor will only use the "un" pronunciation, unless
  otherwise documented by the processor. A TTS processor documented as capable of automatically
  choosing between multiple pronunciations will select either the
  "un" or "une" alias (given a grammatical context). If it selects
  the "une" alias then the "yn" pronunciation will be used because
  it has prefer set to "true".
<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0"
      xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
      xsi:schemaLocation="http://www.w3.org/2005/01/pronunciation-lexicon 
        http://www.w3.org/TR/2007/CR-pronunciation-lexicon-20071212/pls.xsd"
      alphabet="ipa" xml:lang="fr">
  <lexeme>
    <grapheme>1</grapheme>
    <alias>un</alias>
    <alias>une</alias>
  </lexeme>
  <lexeme>
    <grapheme>une</grapheme>
    <phoneme prefer="true">yn</phoneme>
    <phoneme>ynə</phoneme>
    <!-- IPA string is: "ynə" -->
  </lexeme>
</lexicon>
  This section is informative.
In its simplest form the Pronunciation Lexicon language allows orthographies (the textual representation) to be associated with pronunciations (the phonetic/phonemic representation). A Pronunciation Lexicon document typically contains multiple entries. So, for example, to specify the pronunciation for proper names, such as "Newton" and "Scahill", the markup will look like the following.
<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0" 
      xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
      xsi:schemaLocation="http://www.w3.org/2005/01/pronunciation-lexicon 
        http://www.w3.org/TR/2007/CR-pronunciation-lexicon-20071212/pls.xsd"
      alphabet="ipa" xml:lang="en-GB">
  <lexeme>
    <grapheme>Newton</grapheme>
    <phoneme>ˈnjuːtən</phoneme>
    <!-- IPA string is: "ˈnjuːtən" -->
  </lexeme>
  <lexeme>
    <grapheme>Scahill</grapheme>
    <phoneme>ˈskɑhɪl</phoneme>
    <!-- IPA string is: "ˈskɑhɪl" -->
  </lexeme>
</lexicon>
  Here we see the root element <lexicon> which contains the two
  lexemes for the words "Newton" and
  "Scahill". Each <lexeme>
  is a composite element consisting of the orthographic and pronunciation
  representations for the entry. For each of the two <lexeme> elements there is a
  single <grapheme> element
  which includes the orthographic
  text and a single <phoneme> element which includes
  the pronunciation. In this case the alphabet
  attribute of the <lexicon>
  element is set to "ipa", so the International Phonetic Alphabet [IPA] is being used for all the pronunciations.
For ASR systems it is common to rely
  on multiple pronunciations of the same word or phrase in order to
  cope with variations of pronunciation within a language. In the
  Pronunciation Lexicon language,
  multiple pronunciations are represented by more than one <phoneme> (or <alias>) element within the same
  <lexeme> element.
In the following example the word "Newton" has two possible pronunciations.
<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0" 
      xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
      xsi:schemaLocation="http://www.w3.org/2005/01/pronunciation-lexicon 
        http://www.w3.org/TR/2007/CR-pronunciation-lexicon-20071212/pls.xsd"
      alphabet="ipa" xml:lang="en-GB">
  <lexeme>
    <grapheme>Newton</grapheme>
    <phoneme>ˈnjuːtən</phoneme>
    <!-- IPA string is: "ˈnjuːtən" -->
    <phoneme>ˈnuːtən</phoneme>
    <!-- IPA string is: "ˈnuːtən" -->
  </lexeme>
</lexicon>
  In the situation where only a single pronunciation needs to be
  selected among multiple pronunciations that are available (for
  example where a pronunciation
  lexicon is being used by a speech
  synthesis system), then the prefer attribute on
  the <phoneme> element may
  be used to indicate the preferred pronunciation.
In some situations there are alternative textual
  representations for the same word or phrase. This can arise for
  a number of reasons; see Section 4.5 for
  details. Because these are representations that have the same
  meaning (as opposed to homophones),
  it is recommended that they be represented using a single
  <lexeme> element that
  contains multiple graphemes.
Here are two simple examples of multiple orthographies: alternative spellings of an English word and multiple written forms of a Japanese word.
<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0" 
      xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
      xsi:schemaLocation="http://www.w3.org/2005/01/pronunciation-lexicon 
        http://www.w3.org/TR/2007/CR-pronunciation-lexicon-20071212/pls.xsd"
      alphabet="ipa" xml:lang="en-US">
  <!-- English entry showing how alternative spellings are handled -->
  <lexeme>
    <grapheme>colour</grapheme>
    <grapheme>color</grapheme>
    <phoneme>ˈkʌlər</phoneme>
    <!-- IPA string is: "ˈkʌlər" -->
  </lexeme>
</lexicon>
  
<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0" 
      xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
      xsi:schemaLocation="http://www.w3.org/2005/01/pronunciation-lexicon 
        http://www.w3.org/TR/2007/CR-pronunciation-lexicon-20071212/pls.xsd"
      alphabet="ipa" xml:lang="ja">
  <!-- Japanese entry showing how multiple writing systems are handled
          romaji, kanji and hiragana orthographies -->
  <lexeme>
    <grapheme>nihongo</grapheme>
    <grapheme>日本語</grapheme>
    <grapheme>にほんご</grapheme>
    <phoneme>ɲihoŋo</phoneme>
    <!-- IPA string is: "ɲihoŋo" -->
  </lexeme>
</lexicon>
  In some cases the pronunciations may overlap rather than being
  exactly the same. For example the English names "Smyth" and
  "Smith" share one pronunciation, but "Smyth" has a pronunciation
  that is only relevant to itself. Hence this needs to be
  represented using multiple <lexeme> elements.
<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0" 
      xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
      xsi:schemaLocation="http://www.w3.org/2005/01/pronunciation-lexicon 
        http://www.w3.org/TR/2007/CR-pronunciation-lexicon-20071212/pls.xsd"
      alphabet="ipa" xml:lang="en-US">
  <lexeme>
    <grapheme>Smyth</grapheme>
    <grapheme>Smith</grapheme>
    <phoneme>smɪθ</phoneme>
    <!-- IPA string is: "smɪθ" -->
  </lexeme>
  <lexeme>
    <grapheme>Smyth</grapheme>
    <phoneme>smaɪð</phoneme>
    <!-- IPA string is: "smaɪð" -->
  </lexeme>
</lexicon>
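To see how an application might combine the two <lexeme> elements, the following Python sketch builds an index from each grapheme to every pronunciation that applies to it. It is a minimal illustration under stated assumptions: the function name pronunciations_by_grapheme and the choice of a plain dictionary are hypothetical, and only <phoneme> elements are considered.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

PLS_NS = {"pls": "http://www.w3.org/2005/01/pronunciation-lexicon"}

# The Smyth/Smith lexicon, reduced to the attributes this sketch needs.
SAMPLE = """\
<lexicon version="1.0"
      xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
      alphabet="ipa" xml:lang="en-US">
  <lexeme>
    <grapheme>Smyth</grapheme>
    <grapheme>Smith</grapheme>
    <phoneme>smɪθ</phoneme>
  </lexeme>
  <lexeme>
    <grapheme>Smyth</grapheme>
    <phoneme>smaɪð</phoneme>
  </lexeme>
</lexicon>
"""


def pronunciations_by_grapheme(root):
    """Map each grapheme to the pronunciations of every lexeme containing it."""
    index = defaultdict(list)
    for lexeme in root.findall("pls:lexeme", PLS_NS):
        phonemes = [p.text for p in lexeme.findall("pls:phoneme", PLS_NS)]
        for grapheme in lexeme.findall("pls:grapheme", PLS_NS):
            index[grapheme.text].extend(phonemes)
    return dict(index)


index = pronunciations_by_grapheme(ET.fromstring(SAMPLE))
```

With this index, "Smith" maps to the shared pronunciation only, while "Smyth" maps to both, which is the overlap the two <lexeme> elements are meant to express.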
  Most languages have homophones, words with the same pronunciation but different meanings (and possibly different spellings), for instance "seed" and "cede". It is recommended that these be represented as different lexemes.
<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0" 
      xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
      xsi:schemaLocation="http://www.w3.org/2005/01/pronunciation-lexicon 
        http://www.w3.org/TR/2007/CR-pronunciation-lexicon-20071212/pls.xsd"
      alphabet="ipa" xml:lang="en-US">
  <lexeme>
    <grapheme>cede</grapheme>
    <phoneme>siːd</phoneme>
    <!-- IPA string is: "siːd" -->
  </lexeme>
  <lexeme>
    <grapheme>seed</grapheme>
    <phoneme>siːd</phoneme>
    <!-- IPA string is: "siːd" -->
  </lexeme>
</lexicon>
  Most languages have words with different meanings but the same
  spelling (and sometimes different pronunciations), called
  homographs. For example, in English
  the word bass (fish) and the word bass (in
  music) have identical spellings but different meanings and
  pronunciations. Although it is recommended that these words be
  represented using separate <lexeme> elements that are
  distinguished by different values of the role
  attribute (see Section 4.4), if a pronunciation lexicon author does not
  want to distinguish between the two words they could simply be
  represented as alternative pronunciations within the same
  <lexeme> element. In the
  latter case the TTS processor will not be
  able to distinguish when to apply the first or the second
  transcription.
In this example the pronunciations of the homograph "bass" are shown.
<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0" 
      xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
      xsi:schemaLocation="http://www.w3.org/2005/01/pronunciation-lexicon 
        http://www.w3.org/TR/2007/CR-pronunciation-lexicon-20071212/pls.xsd"
      alphabet="ipa" xml:lang="en-US">
  <lexeme>
    <grapheme>bass</grapheme>
    <phoneme>bæs</phoneme>
    <!-- IPA string is: bæs -->
    <phoneme>beɪs</phoneme>
    <!-- IPA string is: beɪs -->
  </lexeme>
</lexicon>
  Note that English contains numerous examples of noun-verb pairs that can be treated either as homographs or as alternative pronunciations, depending on author preference. Two examples are the noun/verb "refuse" and the noun/verb "address".
<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0" 
      xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
      xsi:schemaLocation="http://www.w3.org/2005/01/pronunciation-lexicon 
        http://www.w3.org/TR/2007/CR-pronunciation-lexicon-20071212/pls.xsd"
      xmlns:mypos="http://www.example.com/my_pos_namespace"
      alphabet="ipa" xml:lang="en-US">
  <lexeme role="mypos:verb">
    <grapheme>refuse</grapheme>
    <phoneme>rɪˈfjuːz</phoneme>
    <!-- IPA string is: "rɪˈfjuːz" -->
  </lexeme>
  <lexeme role="mypos:noun">
    <grapheme>refuse</grapheme>
    <phoneme>ˈrefjuːs</phoneme>
    <!-- IPA string is: "ˈrefjuːs" -->
  </lexeme>
</lexicon>
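A processor that knows the part of speech of a word could use the role attribute to pick the matching entry. The Python sketch below shows one possible way to do this. It makes simplifying assumptions: the function name pronunciations_for is hypothetical, the role value is treated as a white space separated list, and roles are compared as literal prefixed strings rather than being resolved against their namespace declarations.

```python
import xml.etree.ElementTree as ET

PLS_NS = {"pls": "http://www.w3.org/2005/01/pronunciation-lexicon"}

# The "refuse" lexicon, reduced to the attributes this sketch needs.
SAMPLE = """\
<lexicon version="1.0"
      xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
      xmlns:mypos="http://www.example.com/my_pos_namespace"
      alphabet="ipa" xml:lang="en-US">
  <lexeme role="mypos:verb">
    <grapheme>refuse</grapheme>
    <phoneme>rɪˈfjuːz</phoneme>
  </lexeme>
  <lexeme role="mypos:noun">
    <grapheme>refuse</grapheme>
    <phoneme>ˈrefjuːs</phoneme>
  </lexeme>
</lexicon>
"""


def pronunciations_for(root, grapheme, role):
    """Collect pronunciations of lexemes matching both grapheme and role."""
    found = []
    for lexeme in root.findall("pls:lexeme", PLS_NS):
        # Simplification: compare the prefixed name as plain text.
        roles = lexeme.get("role", "").split()
        graphemes = [g.text for g in lexeme.findall("pls:grapheme", PLS_NS)]
        if role in roles and grapheme in graphemes:
            found.extend(p.text for p in lexeme.findall("pls:phoneme", PLS_NS))
    return found


root = ET.fromstring(SAMPLE)
noun = pronunciations_for(root, "refuse", "mypos:noun")
verb = pronunciations_for(root, "refuse", "mypos:verb")
```

A more careful implementation would expand each prefixed name to its namespace URI before comparing, so that two documents using different prefixes for the same namespace are treated alike.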
  For some words and phrases pronunciation can be expressed
  quickly and conveniently as a sequence of other orthographies. The developer is not
  required to have linguistic knowledge, but instead makes use of
  the pronunciations that are already expected to be available. To
  express pronunciations using other orthographies the <alias> element may be used.
This feature is particularly useful for acronym expansion.
<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0" 
      xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
      xsi:schemaLocation="http://www.w3.org/2005/01/pronunciation-lexicon 
        http://www.w3.org/TR/2007/CR-pronunciation-lexicon-20071212/pls.xsd"
      alphabet="ipa" xml:lang="en-US">
  <!-- 
        Acronym expansion
  -->
  <lexeme>
    <grapheme>W3C</grapheme>
    <alias>World Wide Web Consortium</alias>
  </lexeme>
  <!-- 
        number representation
  -->
  <lexeme>
    <grapheme>101</grapheme>
    <alias>one hundred and one</alias>
  </lexeme>
  <!-- 
        crude pronunciation mechanism
  -->
  <lexeme>
    <grapheme>Thailand</grapheme>
    <alias>tie land</alias>
  </lexeme>
  <!-- 
        crude pronunciation mechanism and acronym expansion
  -->
  <lexeme>
    <grapheme>BBC 1</grapheme>
    <alias>be be sea one</alias>
  </lexeme>
</lexicon>
  The following contributors provided ideas, comments, feedback and implementation experience to improve this specification (listed in alphabetical order):
The editor wishes to thank the following W3C groups for their helpful comments: WAI and WAI/PF, I18N and MMI.
This specification was written with the help of the following people (listed in alphabetical order):
This section is normative.
There are two schemas which can be used to validate PLS
  documents.
  The latest versions of the schemas are available at:
"http://www.w3.org/TR/pronunciation-lexicon/pls.xsd"
"http://www.w3.org/TR/pronunciation-lexicon/pls.rng"
For stability it is RECOMMENDED that you use the dated URIs available at:
"http://www.w3.org/TR/2008/PR-pronunciation-lexicon-20080818/pls.xsd"
"http://www.w3.org/TR/2008/PR-pronunciation-lexicon-20080818/pls.rng"
This section is normative.
The media type associated with Pronunciation Lexicon
  Specification documents is "application/pls+xml" and
  the filename suffix is ".pls" as defined in
  [RFC4267].
This section is informative.
Speech applications that use a PLS document need a mechanism
  enabling them to retrieve appropriate lexical content. In the
  simplest of cases, an application will search the PLS document
  for <grapheme> elements
  with content that exactly matches the input and retrieve all
  corresponding lexemes. In general, however, the retrieval of
  lexical content is not so trivial; it is necessary to define what
  constitutes an exact match and which lexemes are to be retrieved
  when competing matches can apply.
Here is an example of an approach to retrieve appropriate lexical content.
<grapheme>
    element with content "n't".
<grapheme>
    element whose content exactly matches the longest possible
    sequence of consecutive tokens. Thus, a lexeme for "they'll"
    should have precedence over a lexeme for "they" given the input
    "they'll".
This outlined approach is designed principally with the needs of English in mind and should be modified to accommodate the particular requirements of other languages.
It is recommended that applications that use a PLS document describe the approach they adopt in retrieving lexical content.
An application that uses the following PLS document:
<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0" 
      xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
      xsi:schemaLocation="http://www.w3.org/2005/01/pronunciation-lexicon 
        http://www.w3.org/TR/2007/CR-pronunciation-lexicon-20071212/pls.xsd"
      alphabet="ipa" xml:lang="en-US">
  <lexeme>
    <grapheme>New York</grapheme>
    <alias>NY</alias>
  </lexeme>
  <lexeme>
    <grapheme>York   City</grapheme>
    <alias>YC</alias>
  </lexeme>
</lexicon>
  should process "New York City" as "NY City" rather than "New YC" if it uses the above approach.
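The behaviour in this example can be reproduced with a short Python sketch of longest-match retrieval. It is an illustration only, under assumptions that go beyond the text above: white space in both the grapheme content and the input is collapsed before comparison, tokens are obtained by splitting at white space, and the input is scanned from left to right. The names LEXICON and rewrite are hypothetical, and the lexicon is held as a plain dictionary rather than parsed from a PLS document.

```python
# Hypothetical in-memory form of the lexicon above: grapheme -> alias.
LEXICON = {"New York": "NY", "York   City": "YC"}


def rewrite(text, lexicon):
    """Replace the longest matching token sequences, scanning left to right."""
    # Splitting at white space also collapses the run of spaces in "York   City".
    entries = {tuple(g.split()): alias for g, alias in lexicon.items()}
    longest = max((len(key) for key in entries), default=0)
    tokens = text.split()
    output = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate sequence first, then shorter ones.
        for size in range(min(longest, len(tokens) - i), 0, -1):
            key = tuple(tokens[i:i + size])
            if key in entries:
                output.append(entries[key])
                i += size
                break
        else:
            # No grapheme starts at this token: keep it unchanged.
            output.append(tokens[i])
            i += 1
    return " ".join(output)


result = rewrite("New York City", LEXICON)
```

Because "New York" is matched first, its tokens are consumed before "York City" can be considered, which is why the result is "NY City" and not "New YC". A processor for a language that does not separate words with white space would need a different tokenization step.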
This section is informative.