<spec xmlns:e="http://www.w3.org/1999/XSL/Spec/ElementSyntax" id="spec-top" w3c-doctype="cr">
<header>
<title>XQuery and XPath Full Text 1.0</title>
<w3c-designation>CR-xpath-full-text-10</w3c-designation>
<w3c-doctype>W3C Candidate Recommendation</w3c-doctype>
<pubdate>
 <day>16</day>
 <month>May</month>
 <year>2008</year>
</pubdate>

<publoc>
<loc xmlns:xlink="http://www.w3.org/1999/xlink" href="http://www.w3.org/TR/2008/CR-xpath-full-text-10-20080516/" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest">http://www.w3.org/TR/2008/CR-xpath-full-text-10-20080516/</loc>
</publoc>

<altlocs>
   <loc xmlns:xlink="http://www.w3.org/1999/xlink" href="http://www.w3.org/TR/2008/CR-xpath-full-text-10-20080516/xpath-full-text.xml" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest">XML</loc>
</altlocs>

<latestloc>
<loc xmlns:xlink="http://www.w3.org/1999/xlink" href="http://www.w3.org/TR/xpath-full-text-10/" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest">http://www.w3.org/TR/xpath-full-text-10/</loc>
</latestloc>

<prevlocs>
  <loc xmlns:xlink="http://www.w3.org/1999/xlink" href="http://www.w3.org/TR/2006/WD-xquery-full-text-20060501/" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest"/>
  <loc xmlns:xlink="http://www.w3.org/1999/xlink" href="http://www.w3.org/TR/2005/WD-xquery-full-text-20051103/" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest"/>
  <loc xmlns:xlink="http://www.w3.org/1999/xlink" href="http://www.w3.org/TR/2005/WD-xquery-full-text-20050915/" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest"/>
  <loc xmlns:xlink="http://www.w3.org/1999/xlink" href="http://www.w3.org/TR/2005/WD-xquery-full-text-20050404/" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest"/>
  <loc xmlns:xlink="http://www.w3.org/1999/xlink" href="http://www.w3.org/TR/2004/WD-xquery-full-text-20040709/" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest"/>
</prevlocs>

<authlist>
<author>
	<name>Sihem Amer-Yahia</name>
	<affiliation>AT&amp;T Labs - Research</affiliation>
<!--	<email href="mailto:sihem@research.att.com">sihem@research.att.com</email> -->
</author>
<author>
	<name>Chavdar Botev</name>
	<affiliation>Invited Expert</affiliation>
<!--	<email href="mailto:cbotev@cs.cornell.edu">cbotev@cs.cornell.edu</email> -->
</author>
<author>
	<name>Stephen Buxton</name>
	<affiliation>Mark Logic Corporation</affiliation>
<!--	<email href="mailto:stephen.buxton@marklogic.com">stephen.buxton@marklogic.com</email> -->
</author>
<author>
	<name>Pat Case</name>
	<affiliation>Library of Congress</affiliation>
<!--	<email href="mailto:pcase@crs.loc.gov">pcase@crs.loc.gov</email> -->
</author>
<author>
  <name>Jochen Doerre</name>
  <affiliation>IBM</affiliation>
<!--  <email href="mailto:doerre@de.ibm.com">doerre@de.ibm.com</email> -->
</author>
<author>
	<name>Mary Holstege</name>
	<affiliation>Mark Logic Corporation</affiliation>
<!--	<email href="mailto:mary.holstege@marklogic.com">mary.holstege@marklogic.com</email> -->
</author>
<!-- <author>
	<name>Darin McBeath</name>
	<affiliation>Elsevier</affiliation>
	<email href="mailto:D.McBeath@elsevier.com">D.McBeath@elsevier.com</email>
</author> -->
<author>
	<name>Jim Melton</name>
	<affiliation>Oracle</affiliation>
<!--  <email href="mailto:jim.melton@oracle.com">jim.melton@oracle.com</email> -->
</author>
<author>
	<name>Michael Rys</name>
	<affiliation>Microsoft</affiliation>
<!--	<email href="mailto:mrys@microsoft.com">mrys@microsoft.com</email> -->
</author>
<author>
	<name>Jayavel Shanmugasundaram</name>
	<affiliation>Invited Expert</affiliation>
<!--	<email href="mailto:jai@cs.cornell.edu">jai@cs.cornell.edu</email> -->
</author>
</authlist>

<abstract>
<p>This document defines the syntax and formal semantics of XQuery and XPath Full Text 1.0,
a language that extends XQuery 1.0 <bibref ref="xquery"/>
and XPath 2.0 <bibref ref="xpath20"/> with full-text search capabilities.</p>
</abstract>



<!--* Common status section for QT specs.
    * Use is currently not required, but it simplifies things.
    * 
    * Revisions:
    * 2007-01-15 : CMSMcQ : made file, to simplify publication of Rec.
    * 2008-02-15 : JimMelton : cloned from MSM's REC-only material
                     to generalize for all stages
    *-->

    <status id="status">

<!-- ************************************************************************** -->
<!-- * All Status sections must start with the standard boilerplate paragraph * -->
<!-- *   This entity is defined in status-entities.dtd                        * -->
<!-- ************************************************************************** -->
      <p><emph>This section describes the status of this
         document at the time of its publication.
         Other documents may supersede this document.
         A list of current W3C publications and the latest
         revision of this technical report can be found in the
         <loc xmlns:xlink="http://www.w3.org/1999/xlink" href="http://www.w3.org/TR/" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest">W3C technical reports index</loc>
         at http://www.w3.org/TR/.</emph></p>

<!-- ************************************************************************** -->
<!-- * QT publishes suites of documents, which must be described in the       * -->
<!--     Status section of each document within such a suite.                 * -->
<!-- *   This entity is defined in the host document.                         * -->
<!-- ************************************************************************** -->
      

<!-- ************************************************************************** -->
<!-- * There is a lot of detailed customization based on the document stage   * -->
<!-- *   This entity is defined in the host document.                         * -->
<!-- ************************************************************************** -->
      <p>W3C publishes a <loc xmlns:xlink="http://www.w3.org/1999/xlink" href="http://www.w3.org/2004/02/Process-20040205/tr.html#RecsCR" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest">Candidate Recommendation</loc>, as described in the <loc xmlns:xlink="http://www.w3.org/1999/xlink" href="http://www.w3.org/2004/02/Process-20040205/tr.html" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest">Process Document</loc>,
to indicate that the document is believed to be stable and to encourage implementation
by the developer community.
The publication of this document constitutes a
<loc xmlns:xlink="http://www.w3.org/1999/xlink" href="http://www.w3.org/2004/02/Process-20040205/tr.html#cfi" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest">call for implementations</loc>
of this specification. </p>
<p>This document has been jointly developed by the W3C 
<loc xmlns:xlink="http://www.w3.org/1999/xlink" href="http://www.w3.org/XML/Query/" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest">XML Query Working Group</loc> and the W3C <loc xmlns:xlink="http://www.w3.org/1999/xlink" href="http://www.w3.org/Style/XSL/" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest">XSL Working Group</loc>, each of which is part of the <loc xmlns:xlink="http://www.w3.org/1999/xlink" href="http://www.w3.org/XML/Activity/" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest">XML Activity</loc>.
It will remain a Candidate Recommendation until at least 15 September 2008.
The Working Groups expect to advance this specification to <loc xmlns:xlink="http://www.w3.org/1999/xlink" href="http://www.w3.org/2004/02/Process-20040205/tr.html#RecsW3C" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest">Recommendation</loc> Status.</p>
<p>The <loc xmlns:xlink="http://www.w3.org/1999/xlink" href="http://www.w3.org/XML/Query/" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest">XML Query Working Group</loc> and <loc xmlns:xlink="http://www.w3.org/1999/xlink" href="http://www.w3.org/Style/XSL/" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest">XSL Working Group</loc> intend to submit
this document for consideration as a W3C
<loc xmlns:xlink="http://www.w3.org/1999/xlink" href="http://www.w3.org/2004/02/Process-20040205/tr.html#RecsPR" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest">Proposed Recommendation</loc>
as soon as the following conditions are all met:
</p>
<olist>
<item><p>A test suite is available that tests each identified XQuery and XPath Full Text 1.0 feature,
      both required and optional.</p></item>
<item><p>Minimal Conformance to this specification, as defined in
      <specref ref="id-minimal-conformance"/>, has been demonstrated by at least 
      two distinct implementations, at least one of which uses the XQuery human-readable 
      syntax defined in this specification.</p></item>
<item><p>An XPath Full Text parsing applet that generates XQueryX is available.</p></item>
<item><p>The Working Groups have responded formally to all issues raised during
      the CR period against this document.</p></item>
</olist>
<p>Once the entrance criteria for Proposed Recommendation have been achieved,
the Director will be requested to advance this document to <loc xmlns:xlink="http://www.w3.org/1999/xlink" href="http://www.w3.org/2004/02/Process-20040205/tr.html#RecsPR" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest">Proposed Recommendation</loc> status. 
Working closely with the developer community, we expect to show evidence of implementations
by approximately 15 September 2008. </p>

<!-- ************************************************************************** -->
<!-- * CR documents must cite features at risk                                * -->
<!-- *   This entity is defined in the host document.                         * -->
<!-- ************************************************************************** -->
      <p>The 15 <loc xmlns:xlink="http://www.w3.org/1999/xlink" href="#id-conform-optional-features" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest">optional features</loc> are each individually at risk.
   Optional features for which there are not at least two implementations at the end of the
   Candidate Recommendation period may be removed from this specification.</p>

<!-- ************************************************************************** -->
<!-- * Every Status section must have a customized paragraph                  * -->
<!-- *   This entity is defined completely in the host document.              * -->
<!-- ************************************************************************** -->
      <p>The WGs
believe that this document, published on 16 May 2008,
is sufficiently mature and stable for the development
community to begin developing implementation experience
and reporting on that experience. </p>
<p>The WGs particularly solicit feedback regarding how
<loc xmlns:xlink="http://www.w3.org/1999/xlink" href="#ftthesaurusoption" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest">thesauri</loc> are to be used in combination. </p>

<!-- ************************************************************************** -->
<!-- * CR docs should, and PR docs must, have a pointer to an implementation  * -->
<!-- *   report.  We also want to point to the test suite.                    * -->
<!-- *   This entity is defined in the host document.                         * -->
<!-- ************************************************************************** -->
      <p>No implementation report currently exists.
However, a Test Suite for this document is under development.
Implementors are encouraged to run this test suite and report their results.
The Test Suite can be found at <loc xmlns:xlink="http://www.w3.org/1999/xlink" href="http://dev.w3.org:/cvsweb/2007/xpath-full-text-10-test-suite/" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest">http://dev.w3.org:/cvsweb/2007/xpath-full-text-10-test-suite/</loc>.</p>

<!-- ************************************************************************** -->
<!-- * The Status section should point to a changelog                         * -->
<!-- *   This entity is defined in the host document.                         * -->
<!-- ************************************************************************** -->
      <p>This document incorporates changes made against the <loc xmlns:xlink="http://www.w3.org/1999/xlink" href="http://www.w3.org/2005/10/Process-20051014/tr.html#last-call" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest">Last Call Working Draft</loc> of 18 May 2007.
  Changes to this document since the <loc xmlns:xlink="http://www.w3.org/1999/xlink" href="http://www.w3.org/2005/10/Process-20051014/tr.html#last-call" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest">Last Call Working Draft</loc> are detailed in
  <specref ref="id-xqft-changelog"/>.</p>

<!-- ************************************************************************** -->
<!-- * The Status section must tell readers where to send comments            * -->
<!-- *   This entity is defined in status-entities.dtd                        * -->
<!-- ************************************************************************** -->
      <p>Please report errors in this document using W3C's
         <loc xmlns:xlink="http://www.w3.org/1999/xlink" href="http://www.w3.org/Bugs/Public/" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest">public Bugzilla system</loc>
         (instructions can be found at
         <loc xmlns:xlink="http://www.w3.org/1999/xlink" href="http://www.w3.org/XML/2005/04/qt-bugzilla" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest">http://www.w3.org/XML/2005/04/qt-bugzilla</loc>).
         If access to that system is not feasible, you may send your comments
         to the W3C XSLT/XPath/XQuery public comments mailing list,
         <loc xmlns:xlink="http://www.w3.org/1999/xlink" href="mailto:public-qt-comments@w3.org" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest">public-qt-comments@w3.org</loc>.
         It will be very helpful if you include the string 
         “[FT]”
         in the subject line of your report, whether made in Bugzilla or in email.
         Please use multiple Bugzilla entries (or, if necessary, multiple email messages)
         if you have more than one comment to make.
         Archives of the comments and responses are available at
         <loc xmlns:xlink="http://www.w3.org/1999/xlink" href="http://lists.w3.org/Archives/Public/public-qt-comments/" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest">http://lists.w3.org/Archives/Public/public-qt-comments/</loc>. </p>

<!-- ************************************************************************** -->
<!-- Status sections must state the stability (not stable, or REC) of the document -->
<!-- *   This entity is defined in the host document.                         * -->
<!-- ************************************************************************** -->
      <p>Publication as a Candidate Recommendation
does not imply endorsement by the W3C Membership. 
This is a draft document and may be updated, replaced or obsoleted
by other documents at any time. 
It is inappropriate to cite this document as other than work in progress.</p>

<!-- ************************************************************************** -->
<!-- * Finally, all Status sections must end with the appropriate IPR para    * -->
<!-- *   This entity is defined in status-entities.dtd                        * -->
<!-- ************************************************************************** -->
        <p>This document was produced by groups operating under the
   <loc xmlns:xlink="http://www.w3.org/1999/xlink" href="http://www.w3.org/Consortium/Patent-Policy-20040205/" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest">5 February 2004
   W3C Patent Policy</loc>.
   W3C maintains a <loc xmlns:xlink="http://www.w3.org/1999/xlink" href="http://www.w3.org/2004/01/pp-impl/18797/status#disclosures" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest">public list of any 
   patent disclosures</loc> made in connection with the deliverables of the 
   XML Query Working Group and also maintains a <loc xmlns:xlink="http://www.w3.org/1999/xlink" href="http://www.w3.org/2004/01/pp-impl/19552/status#disclosures" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest">public list of any patent 
   disclosures</loc> made in connection with the deliverables of the XSL 
   Working Group; those pages also include instructions for
   disclosing a patent.
   An individual who has actual knowledge of a patent which the individual believes
   contains
   <loc xmlns:xlink="http://www.w3.org/1999/xlink" href="http://www.w3.org/Consortium/Patent-Policy-20040205/#def-essential" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest">Essential Claim(s)</loc>
   must disclose the information in accordance with
   <loc xmlns:xlink="http://www.w3.org/1999/xlink" href="http://www.w3.org/Consortium/Patent-Policy-20040205/#sec-Disclosure" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest">section 6 of the W3C Patent Policy</loc>. </p>


    </status>


<langusage>
		<language id="EN">English</language>
		<language id="ebnf">EBNF</language>
</langusage>
	
<revisiondesc>

<p>SA January 2004: First version of document before Feb F2F</p>
<p>SA 26 February 2004: Second version of document before Feb F2F
    meetings.</p>
<p>JM 18 May 2007: Last Call Working Draft</p>

</revisiondesc></header>
<body>
<!-- *********************************************************************
      Section 1. Introduction
     ********************************************************************* -->

<div1 id="introduction">
  <head>Introduction</head>

  <p>This document defines the language and the formal semantics of
  XQuery and XPath Full Text 1.0. This language is designed to meet the requirements
  identified in W3C XQuery and XPath Full Text Requirements
  <bibref ref="xqueryft-requirements"/> and to support the queries in
  the W3C XQuery and XPath Full Text Use Cases <bibref ref="xmlquery-full-text-use-cases"/>. </p> 

  <p>XQuery and XPath Full Text 1.0 extends the syntax and semantics of XQuery 1.0 and
  XPath 2.0. </p>

  <p>Additionally, this document defines an XML syntax for XQuery and XPath Full Text 1.0. 
  The most recent versions of the two XQueryX XML Schemas and the
  XQueryX XSLT stylesheet for XQuery and XPath Full Text 1.0 are available at
  <loc xmlns:xlink="http://www.w3.org/1999/xlink" href="http://www.w3.org/2007/xpath-full-text/xpath-full-text-10-xqueryx.xsd" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest">http://www.w3.org/2007/xpath-full-text/xpath-full-text-10-xqueryx.xsd</loc>,
  <loc xmlns:xlink="http://www.w3.org/1999/xlink" href="http://www.w3.org/2007/xpath-full-text/xpath-full-text-10-xqueryx-ftmatchoption-extensions.xsd" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest">http://www.w3.org/2007/xpath-full-text/xpath-full-text-10-xqueryx-ftmatchoption-extensions.xsd</loc>,
  and <loc xmlns:xlink="http://www.w3.org/1999/xlink" href="http://www.w3.org/2007/xpath-full-text/xpath-full-text-10-xqueryx.xsl" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest">http://www.w3.org/2007/xpath-full-text/xpath-full-text-10-xqueryx.xsl</loc>,
  respectively.</p>

	<div2 id="tq-ftsearch-xml">
		<head>Full-Text Search and XML</head> 

<p>As XML becomes mainstream, users expect to be able to 
search their XML documents. This requires a standard way to perform
full-text searches, as well as structured searches, against XML
documents.  A similar requirement for full-text search led ISO to
define the SQL/MM-FT <bibref ref="sqlmm"/> standard.
SQL/MM-FT defines extensions to SQL for expressing
full-text searches and provides functionality similar to that defined in this full-text
language extension to XQuery 1.0 and XPath 2.0.
</p>

<p>XML documents may contain highly structured data (fixed schemas, known types
such as numbers, dates), semi-structured data (flexible schemas and types),
markup data (text with embedded tags), and unstructured data (untagged
free-flowing text). Where a document contains unstructured
or semi-structured data, it is important to be able to search using
Information Retrieval techniques such as scoring and weighting.</p>

<p>Full-text search is different from substring search in many ways:</p>

<olist>
<item><p>A full-text search searches for tokens and phrases
rather than substrings. A substring search for news items that contain
the string "lease" will return a news item that contains "Foobar
Corporation releases the 20.9 version ...". A full-text search for the
token "lease" will not. </p>
</item>

<item><p>A full-text search is expected to support
language-based searches, which a substring search cannot. An
example of a language-based search is "find me all the news items that
contain a token with the same linguistic stem as 'mouse'" (finds "mouse"
and "mice"). Another example based on token proximity is "find me all
the news items that contain the tokens 'XML' and
'Query' allowing up to 3 intervening tokens".</p>
</item>

<item>
<p>Full-text search must address the vagaries and nuances of
language. Search results are often of varying usefulness. When
you search a web site for cameras that cost less than $100, this
is an exact search.  There is a set of cameras that matches this search,
and a set that does not.  Similarly, when you do a string search across
news items for "mouse", there is only one expected result set. When you
do a full-text search for all the news items that contain the
token "mouse", you probably expect to find news items containing the token
"mice", and possibly "rodents", or possibly "computers".  Not
all results are equal. Some results are more "mousey" than others.
Because full-text search may be inexact, we have the notion of score
or relevance. We generally expect to see the most relevant results at
the top of the results list.</p>
</item>
</olist>
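
<p>The differences listed above can be illustrated with a few queries. The following is a
minimal, non-normative sketch. The document structure (<code>news</code>, <code>item</code>,
<code>body</code>) is hypothetical, each <code>item</code> is assumed to have a single
<code>body</code>, and the full-text syntax assumes the <code>ftcontains</code> expression,
the stemming match option, and the distance operator as they are defined later in this
specification. The actual results depend on the tokenization and stemming that an
implementation provides.</p>

<eg xml:space="preserve">(: Substring search: also selects an item whose body contains "releases". :)
/news/item[fn:contains(./body, "lease")]

(: Full-text search: selects only items whose body contains the token "lease". :)
/news/item[./body ftcontains "lease"]

(: Language-based search: tokens with the same linguistic stem as "mouse". :)
/news/item[./body ftcontains "mouse" with stemming]

(: Proximity search: "XML" and "Query" with up to 3 intervening tokens. :)
/news/item[./body ftcontains "XML" ftand "Query" distance at most 3 words]</eg>

<p>The first expression uses only the XQuery 1.0 function <code>fn:contains</code>. The
remaining three rely on tokenization, which is why an item containing only "releases"
would be expected to satisfy the first predicate but not the second.</p>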

<note><p>As XQuery and XPath evolve, they
 may apply the notion of
score to querying structured data. For example, when making travel
plans or shopping for cameras, it is sometimes useful to get an
ordered list of near matches in addition to exact matches. If
 XQuery and XPath define a generalized 
inexact match, we expect XQuery and XPath to utilize the scoring
framework provided by XQuery and XPath Full Text.
</p></note>

<p><termdef id="Full-TextQueriesDef" term="Full-TextQueries"><term>Full-text queries</term> are 
   performed on tokens and phrases. Tokens and phrases are produced via
   tokenization.</termdef> Informally, tokenization breaks a character string into a 
    sequence of tokens, units of punctuation, and spaces.</p>
    
 <p>
Tokenization, in general terms, is the process of converting a text
string into smaller units that are used in query processing. Those
units, called tokens, are the most basic text units that a full-text
search can refer to. Full-text operators typically work on sequences
of tokens found in the target text of a
search. These tokens are characterized by
integers that capture the relative position(s) of the token inside the
string, the relative position(s) of the sentence containing the token,
and the relative position(s) of the paragraph containing the token.  The
positions typically comprise a start and an end position.</p>

<p>
Tokenization, including the definition of the term "tokens", <termref def="should">SHOULD</termref> be
<termref def="dt-implementation-defined">implementation-defined</termref>. 
Implementations <termref def="should">SHOULD</termref> expose the rules and sample
results of tokenization as much as possible to enable users to predict and
interpret the results of tokenization. Tokenization is defined more formally in
<specref ref="TokenizationSec"/>.
</p>
     

<p>
<termdef id="TokenDef" term="Token">A <term>token</term> is a non-empty sequence of characters returned by a tokenizer as a basic unit to be
searched. Beyond that, tokens are <termref def="dt-implementation-defined">implementation-defined</termref>.</termdef>
<termdef id="PhraseDef" term="Phrase">A <term>phrase</term> is an ordered sequence of any number of tokens. Beyond that,
 phrases are <termref def="dt-implementation-defined">implementation-defined</termref>.</termdef>
 </p> 

<note><p>Consecutive tokens need not be separated by either punctuation or
space, and tokens may overlap.</p></note>

<note><p>In some natural languages, tokens and words can be used
interchangeably.</p></note>

<!--<note><p>Tokens are distinct if they are at different positions in the tokenization, regardless of whether they comprise the same sequence of characters.</p></note>-->

<!--<p>Tokenization also uniquely identifies sentences and paragraphs in which tokens appear.</p>-->
 
<p><termdef id="SentenceDef" term="Sentence">A <term>sentence</term> is an ordered sequence
of any number of tokens. 
Beyond that, sentences are <termref def="dt-implementation-defined">implementation-defined</termref>. 
A tokenizer is not required to support sentences.</termdef></p>

<p><termdef id="ParagraphDef" term="Paragraph">A <term>paragraph</term> is an ordered sequence
of any number of tokens. 
Beyond that, paragraphs are <termref def="dt-implementation-defined">implementation-defined</termref>. 
A tokenizer is not required to support paragraphs.</termdef></p>


<p>
Some XML elements represent semantic
markup, e.g., &lt;title&gt;. Others represent formatting markup, e.g.,
&lt;b&gt; to indicate bold.  Semantic markup serves well as token
boundaries. Some formatting markup also serves
well as token boundaries; for example, paragraphs are most commonly delimited
by formatting markup. Other formatting markup may not serve well as token
boundaries. Implementations
are free to provide <termref def="dt-implementation-defined">implementation-defined</termref> ways to differentiate between 
the effects of different kinds of markup on token boundaries during tokenization. In the absence of an implementation-defined way to differentiate, element markup (start tags, end tags, and empty-element tags) creates token boundaries.
</p>

<p>
A sample tokenization is used for the examples in this document. 
The results might be different for other tokenizations. </p>   

<p>Tokenization enables functions and operators that operate on a
part or the root of the token (e.g., wildcards, stemming). </p>

<p>Tokenization enables functions and operators which work with the
relative positions of tokens (e.g., proximity operators). </p>

<p>
This specification focuses on functionality that serves all
languages. It also selectively includes functionalities useful within
specific families of languages. For example, searching within
sentences and paragraphs is useful to many western languages and to
some non-western languages, so that functionality is incorporated into
this specification.
</p>


		<p>Certain aspects of language
		processing are described in this specification as
		<term>implementation-defined</term> or
		<term>implementation-dependent</term>.</p>

<ulist>
  <item>
    <p><termdef id="dt-implementation-defined" term="implementation defined"><term>Implementation-defined</term>
		indicates an aspect that may differ between
		implementations, but must be specified by the
		implementor for each particular
		implementation.</termdef></p>
  </item>
  <item>
    <p>
      <termdef id="dt-implementation-dependent" term="implementation dependent"><term>Implementation-dependent</term>
		indicates an aspect that may differ between
		implementations, is not specified by this or any W3C
		specification, and is not required to be specified by
		the implementor for any particular
		implementation.</termdef></p>
  </item>
</ulist>


</div2>

<div2 id="tq-ft-organization">
 <head>Organization of this document</head> 

<p>This document is organized as follows. We first present a <loc xmlns:xlink="http://www.w3.org/1999/xlink" href="#tq-extensions" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest">high-level syntax</loc> for the XQuery and XPath Full Text 1.0
language along with some examples. Then, we present the <loc xmlns:xlink="http://www.w3.org/1999/xlink" href="#ftselections" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest">syntax and examples</loc> of the
basic primitives in the XQuery and XPath Full Text 1.0 language. This is followed by the
<loc xmlns:xlink="http://www.w3.org/1999/xlink" href="#tq-semantics" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest">semantics</loc> of the XQuery and XPath Full Text 1.0
language. The appendices provide an <loc xmlns:xlink="http://www.w3.org/1999/xlink" href="#id-xpath-grammar" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest">EBNF for the XPath 2.0 Grammar with Full-Text
Extensions</loc>, an <loc xmlns:xlink="http://www.w3.org/1999/xlink" href="#id-grammar" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest">EBNF for the XQuery 1.0
Grammar with Full-Text Extensions</loc>, <loc xmlns:xlink="http://www.w3.org/1999/xlink" href="#ft-acknowledgements" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest">acknowledgements</loc>, and a <loc xmlns:xlink="http://www.w3.org/1999/xlink" href="#ft-glossary" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest">glossary</loc>.</p>

	</div2>

<div2 id="tq-ft-namespaces">
  <head>A word about namespaces</head>

<p>Certain namespace prefixes are predeclared by XQuery 1.0 and, by implication, by this specification,
and bound to fixed namespace URIs. These namespace prefixes are as follows:
</p>

<ulist>

<item>
<p>
<code>xml = http://www.w3.org/XML/1998/namespace</code>
</p>
</item>

<item>
<p>
<code>xs = http://www.w3.org/2001/XMLSchema</code>
</p>
</item>

<item>
<p>
<code>xsi = http://www.w3.org/2001/XMLSchema-instance</code>
</p>
</item>

<item>
<p>
<code>fn = http://www.w3.org/2005/xpath-functions</code>
</p>
</item>

<item>
<p>
<code>local = http://www.w3.org/2005/xquery-local-functions</code>
</p>
</item>

</ulist>

<p>
In addition to the prefixes in the above list, this document uses the prefix
<code>err</code> to represent the namespace URI <code>http://www.w3.org/2005/xqt-errors</code>. 
This namespace prefix is not predeclared and its use in this document is not normative. 
Error codes that are not defined in this document are defined in other XQuery 1.0 and XPath 2.0
specifications, particularly <bibref ref="xpath20"/> and <bibref ref="xpath-functions"/>. 
</p>

<p>
Finally, this document uses the prefix <code>fts</code> to represent a namespace
containing a number of functions used in this document to describe the semantics
of XQuery and XPath Full Text functions. There is no
requirement that these functions be implemented; therefore, no URI is associated with that prefix. 
</p>

</div2>
  
</div1>


<!--
2. TeXQuery Expressions
2.1   Processing Model
2.2   FTContainsExpr 
2.3   Scoring
2.3   Extensions to the Static Context

-->
<div1 id="tq-extensions">
   <head>Full-Text Extensions to XQuery and XPath</head>

<p>XQuery and XPath Full Text extends the languages of XQuery
1.0 and XPath 2.0 in three ways. It:</p> 

<olist>
  <item><p>Adds a new expression called FTContainsExpr;</p>
  </item>

  <item><p>Enhances the syntax of FLWOR expressions in XQuery 1.0 and
  <code>for</code> expressions in XPath 2.0 with optional score
  variables; and</p>
  </item>

  <item><p>Adds static context declarations for full-text match
  options to the query prolog.</p>
  </item>
</olist>

<p>Additionally, it extends the data model and processing models in
various ways.</p>

<div2 id="processing-model">
<head>Processing Model</head>


<p>
A <termref def="dt-ftcontains">full-text contains expression</termref>
(<specref ref="section-ftcontainsexpr"/>)
is composed of
several parts:</p>

<olist>

  <item><p>
  An XPath 2.0 or XQuery 1.0 expression (RangeExpr) that
  specifies the sequence of items to be searched. 
  <termdef id="dt-search-context" term="search context">
  Those items are called
  the <term>search context</term>.</termdef></p>
  </item>

  <item><p>
  The full-text selection to be applied (<specref ref="ftselections"/>).
  <term>Full-text selections</term> 
  are, syntactically and semantically, fully composable and contain:
  </p>
  <ulist>

    <item><p>
    Required:</p>

    <ulist>

      <item><p>
      Tokens and phrases for which a search is performed (<specref ref="ftwords"/>).</p>
      </item>

    </ulist>

    </item>

    <item><p>
    Optional:</p>

    <ulist>

      <item><p>
      Match options, such as indicators for case sensitivity and stop
      words (<specref ref="ftmatchoptions"/>);</p>
      </item>

      <item><p>
      Boolean full-text operators that compose a full-text selection from
      simpler full-text selections (<specref ref="logical_ftoperators"/>);</p>
      </item>

      <item><p>
      Other full-text operators that are constraints on the positions of
      matches, such as indicators for distance between tokens and for the
      cardinality of matches (<specref ref="ftposfilter"/> and 
      <specref ref="fttimes"/>); and</p>
      </item>

      <item><p>
      The weighting information. Each individual search term in a
      full-text selection may be annotated with optional weight
      information. This information may be used during the evaluation
      of the full-text selections to
      calculate scoring, information that quantifies the relevance of the
      result to the given search criteria.</p>
      </item>

    </ulist>

    </item>

  </ulist>

  </item>

  <item><p>
  An optional XPath 2.0 or XQuery 1.0 expression (UnionExpr) that
  specifies the set of nodes, descendants of the RangeExpr, whose
  contents must be ignored for the purpose of determining a match
  during the search (<specref ref="ftignoreoption"/>).</p>
  </item>

</olist>
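
<p>As a non-normative illustration, the following sketch shows all three parts
in a single expression, written against the sample document given in Section
<specref ref="ftselections"/>. The choice of search tokens and of the ignored
<code>note</code> element is an assumption made for this example only; the
ignore option itself is described in Section <specref ref="ftignoreoption"/>.</p>

<eg role="xquery" xml:space="preserve">
(: search context:      the content child of each book (the RangeExpr)   :)
(: full-text selection: two search tokens combined with ftand            :)
(: ignore option:       tokens inside note elements are not considered   :)
/books/book[content ftcontains "usability" ftand "approved"
                    without content content/note]
</eg>

<p>Under the sample tokenization, the token <code>approved</code> occurs only
inside the <code>note</code> element, so the ignore option would be expected to
prevent the book from matching; other tokenizations may behave differently.</p>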

<p>
The results of the evaluation of the full-text selection operators are
instances of the AllMatches model, which complements the XQuery Data
Model (XDM) for processing full-text queries. An AllMatches instance
describes all possible solutions to the full-text query for a given
search context item. Each solution is described by a Match instance. A
Match instance contains the tokens from the search context that must
be included (described using StringInclude instances which model the
positive terms) and the tokens from the search context item that must be
excluded (described using StringExclude instances which model the
negative terms). Each negative or positive term is modeled as a tuple:
the position of the query token or phrase in the full-text selection, and a
TokenInfo structure that describes a set of tokens in the text string which match the query token or phrase.
</p>

<graphic xmlns:xlink="http://www.w3.org/1999/xlink" source="images/ProcMod-XQueryFT.gif" alt="Processing Model Extensions" xlink:type="simple" xlink:show="embed" xlink:actuate="onLoad"/>

<p>Figure 1 provides a schematic overview of the XQuery and XPath Full Text processing steps that are discussed in detail below. 
Some of these steps are completely outside the domain of XQuery; in
Figure 1, these are depicted outside the black line that represents
the boundaries of the language. The diagram shows only the central pieces
of the XQuery Processing Model (see <xspecref spec="XQ" ref="id-processing-model"/>), but zooms in on the Execution Engine,
where the processing of the full-text extensions takes place. The
full-text processing steps are labeled as FTn within the diagram and
are referenced within the text.</p>

<p>
Like all XQuery expressions, an FTContainsExpr returns an XDM
Instance (see Fig. 1). With the exception of FTWords, which consumes TokenInfos,
all full-text selections are closed under the AllMatches data model,
i.e., their input and output are AllMatches instances. Tokenization transforms an XDM
instance into TokenInfos, which ultimately get converted into AllMatches
instances by the evaluation of full-text selections. Thus, the evaluation of
nested full-text and XQuery expressions moves back and forth
between these two models.
</p>

<p>
The resulting AllMatches instance obtained by the evaluation of an FTContainsExpr 
is converted into a Boolean value before being
returned to the enclosing XPath or XQuery operation as follows. If at
least one Match in the AllMatches instance (that is, one member of the
disjunction it represents) contains only positive terms, the
value returned is true. If every Match contains at least one
negative term, the result is false.
</p>
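
<p>The conversion can be sketched, non-normatively, as a quantified expression.
The sketch assumes an XML rendering of an AllMatches instance in which each
Match is a <code>match</code> element whose negative terms appear as
<code>stringExclude</code> children. The function name
<code>local:has-solution</code> and the element names are assumptions of this
sketch, not definitions made by this section.</p>

<eg role="xquery" xml:space="preserve">
(: hypothetical helper: true if some Match carries no negative terms :)
declare function local:has-solution($allMatches as element()) as xs:boolean
{
  some $match in $allMatches/*:match
  satisfies fn:empty($match/*:stringExclude)
};

(: the second match has no stringExclude child, so the call yields true :)
local:has-solution(
  &lt;allMatches&gt;
    &lt;match&gt;&lt;stringInclude/&gt;&lt;stringExclude/&gt;&lt;/match&gt;
    &lt;match&gt;&lt;stringInclude/&gt;&lt;/match&gt;
  &lt;/allMatches&gt;)
</eg>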

<p>
Weighting information, in an <termref def="dt-implementation-dependent">implementation-dependent</termref> fashion, may be
used when calculating the scoring information computed and made
available by FTContainsExpr to the optional score construct.
</p>

<p>
Given the components of a given full-text contains expression, the evaluation
algorithm will proceed according to the following steps, also referenced in the processing model diagram as steps FT<emph>n</emph> (see Fig. 1):
</p>

<olist>

  <item><p>
  Evaluate the search context expression
  (resulting in the sequence of search context items),
  the ignore option, if any
  (resulting in the set of ignored nodes),
  and any other XQuery/XPath expressions nested within the full-text contains expression.
  (FT1)
  </p>
  </item>

  <item><p>
  Tokenize the query string(s). (FT2.1)</p>
  </item>

  <item><p>
  For each search context item:</p>

  <olist>

   <item><p>
  Delete the ignored nodes from the search context item.</p>
  </item>
 
    <item><p>
    Tokenize the result of the previous step.
    This produces a sequence of tokens. (FT2.2)
    Note that implementations may (as an optimization) perform tokenization
    as part of the External Processing that is described in the XQuery Processing Model,
    when an XML document is parsed into an Infoset/PSVI
    and ultimately into an XQuery Data Model instance.</p> 
    </item>

    <item><p>
    Evaluate the FTSelection against the tokens of the search context. (FT3, FT4)</p>
    </item>

  </olist>

  </item>

  <item>
  
  <p>
  Convert the topmost AllMatches instances into a Boolean value. (FT5)</p>

<!-- Bugzilla Bug# 3908 -->
  <p>
  The additional scoring information (also part of FT5) that is produced
  by the evaluation 
  of the full-text contains expression is <termref def="dt-implementation-dependent">implementation-dependent</termref> and is not
  specified in this document. The scoring information is made available at the same time the
  Boolean value is returned.
  </p>

  </item>

</olist>

<p>
(A more detailed version of the above procedure
appears in Section <specref ref="FTContainsSec"/>.)
</p>

<!-- Bugzilla Bug# 3908 -->
<p>
Section <specref ref="ftselections"/>
describes the syntax and the informal semantics of full-text operators. 
Their formal semantics as well as the formal definition of the
AllMatches data model are given in Section <specref ref="tq-semantics"/>.
</p>

</div2>


<div2 id="section-ftcontainsexpr">
   <head>Full-Text Contains Expression</head>

<p>
<termdef id="dt-ftcontains" term="full-text contains expression">A
<term>full-text contains expression</term> is an expression that evaluates a
sequence of items against a full-text selection.
</termdef>
</p>
<p>As a syntactic construct, a full-text contains expression
(grammar symbol: <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTContainsExpr" xlink:type="simple">FTContainsExpr</nt>) 
behaves like a
comparison expression (see <xspecref spec="XQ" ref="id-general-comparisons"/>).
This grammar rule introduces <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTContainsExpr" xlink:type="simple">FTContainsExpr</nt>.</p>


<scrap headstyle="show">
<head/>
<prod num="50" id="noid_N10504.doc-xquery-ComparisonExpr"><lhs>ComparisonExpr</lhs><rhs><nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTContainsExpr" xlink:type="simple">FTContainsExpr</nt> ( (<nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-ValueComp" xlink:type="simple">ValueComp</nt><br/>|  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-GeneralComp" xlink:type="simple">GeneralComp</nt><br/>|  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-NodeComp" xlink:type="simple">NodeComp</nt>)  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTContainsExpr" xlink:type="simple">FTContainsExpr</nt> )?</rhs></prod>
</scrap>

<p>A full-text contains expression may be used anywhere a
ComparisonExpr may be 
used. The <code>ftcontains</code> operator has higher precedence than
other comparison operators,  so the results of <code>ftcontains</code>
expressions may be compared without enclosing them in parentheses.</p>

<div3 id="section-ftcontainsexpr-description">
      <head>Description</head>


<scrap headstyle="show">
<head/>
<prod num="51" id="doc-xquery-FTContainsExpr"><lhs>FTContainsExpr</lhs><rhs><nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-RangeExpr" xlink:type="simple">RangeExpr</nt> ( "ftcontains"  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTSelection" xlink:type="simple">FTSelection</nt>  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTIgnoreOption" xlink:type="simple">FTIgnoreOption</nt>? )?</rhs></prod>
</scrap>

<p>A full-text contains expression returns a Boolean
value. It returns true if there is some item returned by
the RangeExpr that, after 
<loc xmlns:xlink="http://www.w3.org/1999/xlink" href="#TokenizationSec" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest">tokenization</loc>, 
matches the full-text selection <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTSelection" xlink:type="simple">FTSelection</nt>. See Section
<specref ref="ftselections"/> for more details. 
For the purpose of determining
a match, certain descendants of nodes (identified by 
<nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTIgnoreOption" xlink:type="simple">FTIgnoreOption</nt>) in 
the RangeExpr may be ignored, as specified in Section
<specref ref="ftignoreoption"/>.</p>

<p>An XQuery and XPath Full Text processor <termref def="should">SHOULD</termref> try to use the
information available in xml:lang for processing of collations, as well as
the various match options defined in Section <specref ref="ftmatchoptions"/>. 
</p>

</div3>

<div3 id="section-ftcontainsexpr-examples">
   <head>Examples</head>

<p>The following example in XQuery Full Text returns the author of
each book with a title containing a token with the same root as
<code>dog</code> and the token
<code>cat</code>.

		<eg role="xquery" xml:space="preserve">
for $b in /books/book
where $b/title ftcontains ("dog" with stemming) ftand "cat" 
return $b/author</eg>
</p>
		<p>The same example in XPath Full Text is written as:

		<eg role="xpath" xml:space="preserve">

/books/book[title ftcontains ("dog" with stemming) ftand "cat"]/author</eg>
</p>
<p>In the next example a ComparisonExpr is combined with an FTContainsExpr 
using the logical XQuery operator <code>and</code>. The query
selects books that have a price of less than 50 and a title which contains 
a token with the same root as <code>train</code>:</p>
<eg role="xquery" xml:space="preserve">
/books/book[price &lt; 50 and title ftcontains ("train" with stemming)]
</eg>
<p>The following example shows the combination of two <code>ftcontains</code>
expressions the results of which are compared using the not-equals operator. 
The query
selects books where either the title contains the token
<code>dog</code> and the token <code>cat</code> and the content
does not contain a token with the same root as <code>train</code>, or where the
title fails to have one of the matching tokens but the content does:</p>
<eg role="xquery" xml:space="preserve">
/books/book[title ftcontains "dog" ftand "cat" ne
            content ftcontains ("train" with stemming)]
</eg>
   </div3>

	</div2>

	<div2 id="section-score-variables">
	<head>Score Variables</head>
	<p>Besides specifying a match of a full-text 
        query as a Boolean condition, full-text query applications
        typically also have the ability to associate scores with
        the results. <termdef id="Score" term="Score">The <term>score</term> of a full-text query result expresses its relevance to
        the search conditions.</termdef></p>

        <p>XQuery and XPath Full Text extends the languages of
        XQuery 1.0 and XPath 2.0 further  by adding optional 
        <code>score</code> variables to the <code>for</code> and
        <code>let</code> clauses of FLWOR expressions.</p>

        <p>The production for the extended <code>for</code> clause in XQuery 1.0 follows.


<scrap headstyle="show">
<head/>
<prod num="35" id="doc-xquery-ForClause"><lhs>ForClause</lhs><rhs>"for"  "$"  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-VarName" xlink:type="simple">VarName</nt>  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-TypeDeclaration" xlink:type="simple">TypeDeclaration</nt>?  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-PositionalVar" xlink:type="simple">PositionalVar</nt>?  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTScoreVar" xlink:type="simple">FTScoreVar</nt>?  "in"  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-ExprSingle" xlink:type="simple">ExprSingle</nt>  (","  "$"  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-VarName" xlink:type="simple">VarName</nt>  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-TypeDeclaration" xlink:type="simple">TypeDeclaration</nt>?  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-PositionalVar" xlink:type="simple">PositionalVar</nt>?  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTScoreVar" xlink:type="simple">FTScoreVar</nt>?  "in"  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-ExprSingle" xlink:type="simple">ExprSingle</nt>)*</rhs></prod>
<prod num="37" id="doc-xquery-FTScoreVar"><lhs>FTScoreVar</lhs><rhs>"score"  "$"  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-VarName" xlink:type="simple">VarName</nt></rhs></prod>
</scrap> 
</p>

<p>In XPath 2.0, the SimpleForClause is extended similarly.</p>

<p>When a <code>score</code> variable is present in a <code>for</code> 
clause the evaluation of the expression following the <code>in</code>
keyword not only needs to determine the result sequence of the
expression, i.e., the sequence of items which are iteratively
bound to the <code>for</code> variable. It must also determine in each
iteration the relevance "score" value of the current item
and bind the <code>score</code> variable to that value. </p> 

<p>The semantics of scoring and how it relates to second-order functions is 
discussed in Section <specref ref="ScoreSec"/>.</p>

<p>In the following example <code>book</code> elements are determined that satisfy
the condition <code>[content ftcontains "web site" ftand "usability" and
.//chapter/title ftcontains "testing"]</code>. The scores assigned to the
<code>book</code> elements are returned.

		<eg role="xquery" xml:space="preserve">
for $b score $s 
    in /books/book[content ftcontains "web site" ftand "usability" 
                   and .//chapter/title ftcontains "testing"]
return $s
</eg>
</p>

<p>The example above is
also a legal example of the XPath 2.0 extension.</p>

<p>Scores are typically used to order results, as in the 
following, more complete example.
		<eg role="xquery" xml:space="preserve">
for $b score $s 
    in /books/book[content ftcontains "web site" ftand "usability"]
where $s &gt; 0.5
order by $s descending
return &lt;result&gt;  
          &lt;title&gt; {$b//title} &lt;/title&gt; 
          &lt;score&gt; {$s} &lt;/score&gt; 
       &lt;/result&gt;
</eg>
</p>

<p>Note that the score variable gets <emph>one</emph> score value for each item
     in the value of the expression after the <code>in</code> keyword,
     regardless of the number of FTContainsExprs in that expression. In the following example, two separate full-text contains expressions are
used to select the matching paragraphs. There is still just one score for each
<code>para</code> returned.  The highest scoring paragraphs will be returned
first:
</p>

<eg role="xquery" xml:space="preserve">
for $p score $s in //book[title ftcontains "software"]/para[. ftcontains "usability"]
     order by $s descending
  return $p
</eg>

<p>The following more elaborate example uses multiple score variables to
return the matching paragraphs ordered so that those from the highest scoring
books precede those from the lowest scoring books, where the highest scoring
paragraphs of each book are returned before the lower scoring paragraphs of
that book:
</p>
<eg role="xquery" xml:space="preserve">
for $b score $score1 in //book[title ftcontains "software"]
    order by $score1 descending
return
    for $p score $score2 in $b/para[. ftcontains "usability"]
       order by $score2 descending
    return $p
</eg>

<p>The <code>score</code> variable is bound to a value which reflects
the relevance of the match criteria in the 
full-text selections to the items returned by the respective RangeExprs. The
calculation of relevance is <termref def="dt-implementation-dependent">implementation-dependent</termref>, but score
evaluation must follow these rules:</p>

<olist>
<item><p>Score values are of type <code>xs:double</code> in the range
[0, 1].</p></item> 
<item><p>For score values greater than 0, a higher score must imply a
higher degree of relevance.</p></item>
</olist>

<p>Similarly to their use in a <code>for</code> clause, score variables
may be specified in a <code>let</code> clause. A score variable in a
<code>let</code> clause is also bound to the score of the expression
evaluation, but in the <code>let</code> clause one score is determined
for the complete result. </p>

<p>The production for the extended <code>let</code> clause follows.


<scrap headstyle="show">
<head/>
<prod num="38" id="doc-xquery-LetClause"><lhs>LetClause</lhs><rhs>(("let"  "$"  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-VarName" xlink:type="simple">VarName</nt>  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-TypeDeclaration" xlink:type="simple">TypeDeclaration</nt>?)  |  ("let"  "score"  "$"  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-VarName" xlink:type="simple">VarName</nt>))  ":="  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-ExprSingle" xlink:type="simple">ExprSingle</nt>  (","  (("$"  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-VarName" xlink:type="simple">VarName</nt>  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-TypeDeclaration" xlink:type="simple">TypeDeclaration</nt>?)  |  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTScoreVar" xlink:type="simple">FTScoreVar</nt>)  ":="  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-ExprSingle" xlink:type="simple">ExprSingle</nt>)*</rhs></prod>
</scrap> 
</p>

<p>When using the score option in a <code>for</code> clause the
expression following the <code>in</code> keyword has the dual purpose
of filtering, i.e., driving the iteration, and determining the scores. 
It is possible to separately specify expressions for filtering and
scoring by combining a simple <code>for</code> clause with a
<code>let</code> clause that uses scoring. The following is 
an example of this.

		<eg role="xquery" xml:space="preserve">
for $b in /books/book[.//chapter/title ftcontains "testing"]
let score $s := $b/content ftcontains "web site" ftand "usability" 
order by $s descending
return &lt;result score="{$s}"&gt;{$b}&lt;/result&gt;
</eg>
This example returns <code>book</code> elements with chapter titles that contain "testing". Scores are returned along with the <code>book</code> elements. These scores, however, reflect whether the book content contains "web site" and "usability".</p>

<p>Note that the score of an 
FTContainsExpr is not required to be 0 when the expression evaluates to false, nor to
be non-zero when the expression evaluates to true.
Hence, in the example above it is not possible to infer the Boolean
value of the FTContainsExpr in the <code>let</code> clause from the
calculated score of a returned <code>result</code> element. For instance, an
implementation may want to assign a non-zero score to a book that
contained "web site", but not "usability", as this may be
considered more relevant than a book that does not contain "web site" or "usability".
</p>


<p>
The expression ExprSingle associated with the score variable is passed to
the scoring algorithm. The scoring
algorithm calculates the score value based on the passed expression
(not on the value returned by evaluating the expression). The set of expressions supported by the scoring algorithm is <termref def="dt-implementation-defined">implementation-defined</termref>. If an expression not supported by the scoring algorithm is passed to the scoring algorithm, the result is implementation-defined.
</p>

<p>The use of <code>score</code> variables introduces a second-order
aspect to the evaluation of expressions which cannot be emulated by
(first-order) XQuery functions. Consider the following replacement of
the clause <code>let score $s := FTContainsExpr</code></p>

		<eg xml:space="preserve">
let $s := score(FTContainsExpr)
</eg>

<p>where a function <code>score</code> is applied to some
FTContainsExpr. If the function <code>score</code> were first-order, it
would only be applied to the result of the evaluation of 
its argument, which is one of the Boolean constants <code>true</code>
or <code>false</code>. Hence, there would be at most two possible
values such a <code>score</code> function would be able to return and
no further differentiation would be possible. </p>


   <div3 id="section-using-weights">
      <head>Using Weights Within a Scored FTContainsExpr</head>

<p><termdef id="WeightDeclarationsDef" term="WeightDeclarations">Scoring may be influenced by adding <term>weight declarations</term> to search tokens, phrases, and expressions.</termdef> Weight declarations are introduced syntactically in the FTSelection
production, described in Section <specref ref="ftselections"/>.
</p>

<p>The weight <termref def="must">MUST</termref> have an absolute value between 0.0 and 1000.0 inclusive.</p>

<p>The weights assigned are not related to any absolute standard, but typically have a relationship to other weights within the same FTContains expression.</p>

<p>The effect of weights on the resulting score is
<termref def="dt-implementation-dependent">implementation-dependent</termref>. However, scoring algorithms <termref def="must">MUST</termref> conform to 
these constraints:</p> 
<olist>
<item><p>When no explicit weight is specified, the default weight is
1.0; and</p></item>
<item><p>
Weight declarations in an FTContainsExpr for which no scores are
evaluated are ignored. 
</p></item>
</olist>
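
<p>As a non-normative illustration of the second constraint, the following
query contains a weight declaration but binds no score variable. Because no
score is evaluated for this FTContainsExpr, the weight is ignored, and the
query selects the same books as it would without the weight declaration. The
search token is chosen for this example only.</p>

<eg role="xquery" xml:space="preserve">
(: no score variable is bound, so "weight 2" has no effect on the result :)
/books/book[content ftcontains ("usability" weight 2)]
</eg>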


<p>The following example illustrates how different weights can be used
for different search terms.
		<eg role="xquery" xml:space="preserve">
for $b in /books/book
let score $s := $b/content ftcontains ("web site" weight 0.5)
                                ftand ("usability" weight 2)
return &lt;result score="{$s}"&gt;{$b}&lt;/result&gt;
</eg>
</p>

   </div3>
	</div2>


   <div2 id="section-extensions-static-context">
      <head>Extensions to the Static Context</head>
<p>
The XQuery Static Context is extended with a component for each
full-text <termref def="dt-match-option-group">match option group</termref>.
The settings of these components can be changed
by using the following declaration syntax in the Prolog.
<scrap headstyle="show"><head/>
	<prod num="6" id="doc-xquery-Prolog"><lhs>Prolog</lhs><rhs>((<nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-DefaultNamespaceDecl" xlink:type="simple">DefaultNamespaceDecl</nt>  |  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-Setter" xlink:type="simple">Setter</nt>  |  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-NamespaceDecl" xlink:type="simple">NamespaceDecl</nt>  |  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-Import" xlink:type="simple">Import</nt>  |  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTOptionDecl" xlink:type="simple">FTOptionDecl</nt>)  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-Separator" xlink:type="simple">Separator</nt>)*  ((<nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-VarDecl" xlink:type="simple">VarDecl</nt>  |  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-FunctionDecl" xlink:type="simple">FunctionDecl</nt>  |  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-OptionDecl" xlink:type="simple">OptionDecl</nt>)  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-Separator" xlink:type="simple">Separator</nt>)*</rhs></prod>
	<prod num="14" id="doc-xquery-FTOptionDecl"><lhs>FTOptionDecl</lhs><rhs>"declare"  "ft-option"  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTMatchOptions" xlink:type="simple">FTMatchOptions</nt></rhs></prod>
</scrap>
Match options modify the match semantics of full-text
expressions. They are described in detail in  
Section <specref ref="ftmatchoptions"/>. When a match
option is specified explicitly in a full-text expression,
it overrides the setting of the respective component in the
static context.
</p>
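
<p>The following non-normative sketch shows a Prolog declaration together with
a local override. It assumes the stemming match option keywords
(<code>with stemming</code> and <code>without stemming</code>) described in
Section <specref ref="ftmatchoptions"/>; the search tokens are chosen for
illustration only.</p>

<eg role="xquery" xml:space="preserve">
(: set the stemming component of the static context for the whole query :)
declare ft-option with stemming;

for $b in /books/book
where $b/title ftcontains "review"                         (: uses the declared setting :)
  and $b/content ftcontains ("users" without stemming)     (: explicit option overrides it :)
return $b/author
</eg>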



   </div2>
</div1>


<div1 id="ftselections">
	<head>Full-Text Selections</head>

<p>This section describes the
full-text selections which contain the full-text
operators in a <termref def="dt-ftcontains">full-text contains
expression</termref>  
(<nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTContainsExpr" xlink:type="simple">FTContainsExpr</nt>), as 
well as the match options which modify the matching semantics of the 
full-text selections. In the following, the syntax for each type of
full-text selection is given together with an informal statement of
its meaning.</p>

<p><termdef id="ftselection" term="full-text selection">A 
<term>full-text selection</term> specifies the conditions of a full-text search.
</termdef></p>

<scrap headstyle="show">
<head/>
<prod num="144" id="doc-xquery-FTSelection"><lhs>FTSelection</lhs><rhs><nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTOr" xlink:type="simple">FTOr</nt>  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTPosFilter" xlink:type="simple">FTPosFilter</nt>*  ("weight"  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-RangeExpr" xlink:type="simple">RangeExpr</nt>)?</rhs></prod>
</scrap>

<p>As shown in the grammar, a full-text selection consists of search 
conditions possibly involving logical operators (<nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTOr" xlink:type="simple">FTOr</nt>) followed by an 
arbitrary number of positional filters (<nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTPosFilter" xlink:type="simple">FTPosFilter</nt>)
optionally followed by a "weight" value which is specified using a 
 range expression.
The RangeExpr is evaluated as if it were an argument to a function 
with an expected type <code>xs:double</code>; it must
be between 0.0 and 1000.0 inclusive.</p>

<p>The syntax and semantics of the individual full-text selection
operators follow.</p>


<p>This XML document
is the source document for examples in this section. </p>

<eg xml:space="preserve">
&lt;books&gt;
  &lt;book number="1"&gt;
    &lt;title shortTitle="Improving Web Site Usability"&gt;Improving  
        the Usability of a Web Site Through Expert Reviews and
        Usability Testing&lt;/title&gt;
    &lt;author&gt;Millicent Marigold&lt;/author&gt;
    &lt;author&gt;Montana Marigold&lt;/author&gt;
    &lt;editor&gt;Véra Tudor-Medina&lt;/editor&gt;
    &lt;content&gt;
      &lt;p&gt;The usability of a Web site is how well the  
          site supports the users in achieving specified  
          goals. A Web site should facilitate learning,  
          and enable efficient and effective task  
          completion, while propagating few errors.
      &lt;/p&gt;
      &lt;note&gt;This book has been approved by the Web Site  
          Users Association.
      &lt;/note&gt;
    &lt;/content&gt;
  &lt;/book&gt;
&lt;/books&gt;
</eg>

<p>Tokenization is <termref def="dt-implementation-defined">implementation-defined</termref>. A sample tokenization is
used for the examples in this section. 
This sample tokenization uses white space, punctuation and XML tags as word-breakers and 
<code>&lt;p&gt;</code> for paragraph boundaries. The results may be different
for other tokenizations.</p>  
 <p>The first five tokens in this example using the sample tokenization would be "Improving", "the", "usability", "of", and "a".</p>

<p>Unless stated otherwise, the results
assume a case-insensitive match.</p>

<div2 id="ftprimary">
	<head>Primary Full-Text Selections</head>

<scrap headstyle="show">
<head/>
<prod num="150" id="doc-xquery-FTPrimary"><lhs>FTPrimary</lhs><rhs>(<nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTWords" xlink:type="simple">FTWords</nt>  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTTimes" xlink:type="simple">FTTimes</nt>?)  |  ("("  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTSelection" xlink:type="simple">FTSelection</nt>  ")")  |  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTExtensionSelection" xlink:type="simple">FTExtensionSelection</nt></rhs></prod>
</scrap>

<p><termdef id="dt-ftprimary" term="primary full-text selection">A 
<term>primary full-text selection</term> is the basic form of a 
full-text selection. It specifies tokens and phrases as search 
conditions (<nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTWords" xlink:type="simple">FTWords</nt>), optionally followed by a cardinality constraint 
(<nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTTimes" xlink:type="simple">FTTimes</nt>). An <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTSelection" xlink:type="simple">FTSelection</nt> 
in parentheses and the <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTExtensionSelection" xlink:type="simple">FTExtensionSelection</nt>
are also primary full-text selections.</termdef>
</p>

</div2>


<div2 id="ftwords">
	<head>Search Tokens and Phrases</head>

<scrap headstyle="show">
<head/>
<prod num="151" id="doc-xquery-FTWords"><lhs>FTWords</lhs><rhs><nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTWordsValue" xlink:type="simple">FTWordsValue</nt>  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTAnyallOption" xlink:type="simple">FTAnyallOption</nt>?</rhs></prod>
<prod num="152" id="doc-xquery-FTWordsValue"><lhs>FTWordsValue</lhs><rhs><nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-Literal" xlink:type="simple">Literal</nt>  |  ("{"  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-Expr" xlink:type="simple">Expr</nt>  "}")</rhs></prod> 
<prod num="154" id="doc-xquery-FTAnyallOption"><lhs>FTAnyallOption</lhs><rhs>("any"  "word"?)  |  ("all"  "words"?)  |  "phrase"</rhs></prod>
</scrap>

<p><nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTWords" xlink:type="simple">FTWords</nt> finds matches that contain the specified 
tokens and phrases.</p>

<p>FTWords consists of two parts: a mandatory <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTWordsValue" xlink:type="simple">
FTWordsValue</nt> part and an optional <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTAnyallOption" xlink:type="simple">
FTAnyallOption</nt> part. <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTWordsValue" xlink:type="simple">FTWordsValue</nt> specifies the tokens and phrases
that must be contained in the matches. <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTAnyallOption" xlink:type="simple">FTAnyallOption</nt> specifies how 
containment is checked. </p>

<p>In general, the tokens and phrases in <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTWordsValue" xlink:type="simple">
FTWordsValue</nt> are specified using a nested XQuery expression. 
To simplify notation, the enclosing braces may be omitted if <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTWordsValue" xlink:type="simple">FTWordsValue</nt> consists of a single literal.
</p>

<p>The following rules specify how an <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTWordsValue" xlink:type="simple">FTWordsValue</nt>
matches tokens and phrases. First, the 
<nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTWordsValue" xlink:type="simple">FTWordsValue</nt> is converted to a sequence of
strings as though it were an argument to a function with the expected
type of <code>xs:string*</code>. Then, each of those strings is tokenized into a
sequence of tokens as 
described in <loc xmlns:xlink="http://www.w3.org/1999/xlink" href="#TokenizationSec" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest">Section 4.1 Tokenization</loc>.
Then, <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTAnyallOption" xlink:type="simple">FTAnyallOption</nt> is checked.</p>

<p>If <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTAnyallOption" xlink:type="simple">FTAnyallOption</nt> is "any", the sequence of tokens for each string is
considered as a phrase, i.e. a match is found in the tokenized form of 
the text being searched, whenever that form contains a subsequence of tokens
that 
corresponds to the sequence of query tokens in an implementation-defined
way and that subsequence of tokens covers consecutive token positions in 
the tokenized text. If the value of the FTWordsValue contains more 
than one string, 
the different strings are considered to be alternatives, i.e.  the resulting 
matches must contain at least one of the generated phrases.</p>

<p>If <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTAnyallOption" xlink:type="simple">FTAnyallOption</nt> is "all", the sequence of tokens for each string is
considered as a phrase. The resulting matches must contain all of the 
generated phrases.</p>

<p>If <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTAnyallOption" xlink:type="simple">FTAnyallOption</nt> is "phrase", the tokens from all the strings are
concatenated in a single sequence, which is considered as a phrase. The
resulting matches must contain the generated phrase.</p>

<p>If <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTAnyallOption" xlink:type="simple">FTAnyallOption</nt> is "any word", the tokens from all the strings are
combined into a single set. The resulting matches must contain at least
one of the tokens in the set.</p>

<p>If <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTAnyallOption" xlink:type="simple">FTAnyallOption</nt> is "all words", the tokens from all the strings are
combined into a single set. The resulting matches must contain all
of the tokens in the set.</p>

<p>If the <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTWordsValue" xlink:type="simple">FTWordsValue</nt> evaluates to
a single string, the use of "any", "all", and "phrase" in
<nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTAnyallOption" xlink:type="simple">FTAnyallOption</nt> produces the same
results.</p>

<p>If <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTAnyallOption" xlink:type="simple">FTAnyallOption</nt> is omitted, "any" is 
the default.</p>
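
<p>As an illustrative sketch (not one of the sample-document examples), the
five forms of FTAnyallOption can be compared side by side on a small
constructed element. The variable <code>$t</code> and its content are
assumptions made for this illustration, as are a tokenizer that splits on
whitespace and the default match options. The comments restate the rules above
under those assumptions:</p>
<eg role="xquery" xml:space="preserve">let $t := &lt;title&gt;Improving Web Site Usability&lt;/title&gt;
return (
  (: two phrases as alternatives; "web site" is present :)
  $t ftcontains {"web site", "usability testing"} any,
  (: both phrases are required; "usability testing" is absent :)
  $t ftcontains {"web site", "usability testing"} all,
  (: one phrase of four consecutive tokens: web site usability testing :)
  $t ftcontains {"web site", "usability testing"} phrase,
  (: any one of the tokens web, site, usability, testing suffices :)
  $t ftcontains {"web site", "usability testing"} any word,
  (: all four tokens are required, in any order; "testing" is absent :)
  $t ftcontains {"web site", "usability testing"} all words
)</eg>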

<p>The following expression returns the sample <code>book</code> element,
because its <code>title</code>
element contains the token "Expert":</p>
<eg role="xpath" xml:space="preserve">//book[./title ftcontains "Expert"]</eg>

<p>The following expression returns the sample <code>book</code> element,
because its <code>title</code>
element contains the phrase "Expert Reviews":</p>
<eg role="xpath" xml:space="preserve">//book[./title ftcontains "Expert Reviews"]</eg>

<p>The following expression returns the sample <code>book</code> element, 
because its <code>title</code> 
element contains the two tokens "Expert" and "Reviews":</p>
<eg role="xpath" xml:space="preserve">//book[./title ftcontains {"Expert", "Reviews"} all]</eg>

<p>The following expression returns false for our sample document, because 
the <code>p</code> element doesn't
contain the phrase "Web Site Usability" although it contains all of the tokens
in the phrase:</p>
<eg role="xpath" xml:space="preserve">//book//p ftcontains "Web Site Usability"</eg> 

<p>The following expression returns book numbers of <code>book</code> elements by
"Marigold" with a title about "Web Site Usability", sorting them in descending
score order: </p> 
<eg role="xquery" xml:space="preserve">for $book in /books/book[.//author ftcontains "Marigold"] 
let score $score := $book/title ftcontains "Web Site Usability" 
where $score &gt; 0.8 
order by $score descending
return $book/@number</eg> 

</div2>

<div2 id="fttimes">
	<head>Cardinality Selection</head>

<scrap headstyle="show"><head/>
			<prod num="155" id="doc-xquery-FTTimes"><lhs>FTTimes</lhs><rhs>"occurs"  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTRange" xlink:type="simple">FTRange</nt>  "times"</rhs></prod>
</scrap>

<p><termdef id="dt-cardinality-selection" term="cardinality selection">A
<term>cardinality selection</term> consists of an 
<nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTWords" xlink:type="simple">FTWords</nt> followed
by the <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTTimes" xlink:type="simple">FTTimes</nt> postfix operator.</termdef>
A cardinality selection selects matches for which the operand 
<nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTWords" xlink:type="simple">FTWords</nt> is matched a specified number of
times. </p>

<p>A cardinality selection restricts the number of distinct
matches of <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTWords" xlink:type="simple">FTWords</nt> to the
specified range. The semantics of FTRange are described in 
<specref ref="ftdistance"/>. </p>

<p>In the document fragment "very very big":</p>

<olist>

<item>
<p>
The <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTWords" xlink:type="simple">FTWords</nt> <code>"very big"</code> has 1
match consisting of the second "very" and "big".
</p>
</item>

<item>
<p>
The <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTWords" xlink:type="simple">FTWords</nt> <code>{"very", "big"} all</code>
has 2 matches; one consisting of the first "very" and "big", and
the other containing the second "very" and "big".
</p>
</item>

<item>
<p>
The <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTWords" xlink:type="simple">FTWords</nt> <code>{"very", "big"} any</code> 
has 3 matches, one for each occurrence of "very" or "big". 
</p>
</item>

</olist>
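
<p>The match counts above can be turned into cardinality tests. The following
sketch is an illustration only: the variable <code>$d</code> is a hypothetical
element holding the fragment, and the range forms used (<code>exactly</code>,
<code>at least</code>, <code>from ... to</code>) are those of FTRange as
described in <specref ref="ftdistance"/>. Given the counts listed above, each
of the three tests is expected to hold:</p>
<eg role="xquery" xml:space="preserve">let $d := &lt;p&gt;very very big&lt;/p&gt;
return (
  (: 1 match of the phrase :)
  $d ftcontains "very big" occurs exactly 1 times,
  (: 2 matches, so "at least 2" is satisfied :)
  $d ftcontains {"very", "big"} all occurs at least 2 times,
  (: 3 matches, which lies within the range 2 to 3 :)
  $d ftcontains {"very", "big"} any occurs from 2 to 3 times
)</eg>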

<p>The following expression returns the example <code>book</code> element's 
number, because the <code>book</code> element contains 2 or more occurrences 
of "usability":</p>

<eg role="xpath" xml:space="preserve">//book[. ftcontains "usability" occurs at least 2 times]/@number</eg>

<p>The following expression returns the empty sequence, because there are 
3 occurrences of <code>{"usability", "testing"} any</code> in the designated 
<code>title</code>:</p>

<eg role="xpath" xml:space="preserve">//book[@number="1" and title ftcontains {"usability", 
"testing"} any occurs at most 2 times] </eg>


</div2>



<div2 id="ftmatchoptions">
	<head>Match Options</head>


<p>Full-text match options modify the matching behaviour of 
the <termref def="dt-ftprimary">primary full-text selection</termref> to which 
they are applied. </p> 

<scrap headstyle="show"><head/>
	<prod num="149" id="doc-xquery-FTPrimaryWithOptions"><lhs>FTPrimaryWithOptions</lhs><rhs><nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTPrimary" xlink:type="simple">FTPrimary</nt>  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTMatchOptions" xlink:type="simple">FTMatchOptions</nt>?</rhs></prod>
	<prod num="165" id="doc-xquery-FTMatchOptions"><lhs>FTMatchOptions</lhs><rhs><nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTMatchOption" xlink:type="simple">FTMatchOption</nt>+</rhs></prod>
	<prod num="166" id="doc-xquery-FTMatchOption"><lhs>FTMatchOption</lhs><rhs><nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTLanguageOption" xlink:type="simple">FTLanguageOption</nt><br/>|  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTWildCardOption" xlink:type="simple">FTWildCardOption</nt><br/>|  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTThesaurusOption" xlink:type="simple">FTThesaurusOption</nt><br/>|  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTStemOption" xlink:type="simple">FTStemOption</nt><br/>|  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTCaseOption" xlink:type="simple">FTCaseOption</nt><br/>|  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTDiacriticsOption" xlink:type="simple">FTDiacriticsOption</nt><br/>|  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTStopWordOption" xlink:type="simple">FTStopWordOption</nt><br/>|  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTExtensionOption" xlink:type="simple">FTExtensionOption</nt></rhs></prod>
</scrap>

<p><termdef id="dt-match-options" term="match option"><term>Match options</term>  modify the set of tokens
      in the query, or how they are matched against tokens in the
      text.</termdef> 
</p>
<p><termdef id="dt-match-option-group" term="match option group">
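Each of the seven alternatives of production 
<nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTMatchOption" xlink:type="simple">FTMatchOption</nt>
other than <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTExtensionOption" xlink:type="simple">FTExtensionOption</nt>
corresponds to one <term>match option group</term>. </termdef>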
The match options from any given group are mutually exclusive, i.e., 
only one of these settings can be in effect, whereas match options of
different groups can be combined freely.</p>

<p>
Note that, in addition to the syntax rules above, an extra-grammatical 
constraint,
<loc xmlns:xlink="http://www.w3.org/1999/xlink" href="#parse-note-multiple-match-options" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest">multiple-match-options</loc>,
applies when multiple match options are specified.
It states that within a single <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTMatchOptions" xlink:type="simple">FTMatchOptions</nt>
at most one match option of any given 
<termref def="dt-match-option-group">match option group</termref> may
be specified. 
For example, if the <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTCaseOption" xlink:type="simple">FTCaseOption</nt> "lowercase" 
is specified, then "uppercase" cannot also be specified as part of the same 
<nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTMatchOptions" xlink:type="simple">FTMatchOptions</nt>.
</p>
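
<p>As a sketch of this constraint, the first selection below combines options
from two different match option groups, while the second names two options
from the same group (FTCaseOption) and therefore violates
multiple-match-options. The pairing of options is chosen only for
illustration:</p>
<eg role="xpath" xml:space="preserve">(: permitted: one case option and one stemming option :)
/books/book/title ftcontains "usability" lowercase with stemming

(: not permitted: two case options in the same FTMatchOptions :)
/books/book/title ftcontains "usability" lowercase uppercase</eg>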

<p>Although match options only take effect in the application of 
<nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTWords" xlink:type="simple">FTWords</nt>, the syntax also allows match options 
to be specified for the non-primitive full-text selection 
<code>"(" FTSelection ")"</code>. Such a higher-level match option
provides a default for the respective match option group for any
embedded <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTPrimary" xlink:type="simple">FTPrimary</nt>, just as
<nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTOptionDecl" xlink:type="simple">match option declarations</nt>
in the <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-Prolog" xlink:type="simple">Prolog</nt>
provide default match options for the whole query. 
</p>

<p>
Match options are propagated through the query via the static context.
For each of the seven match option groups,
the static context has a component
that contains one option from that group.
The seven settings are initialized by the implementation
in accordance with the table in 
Appendix <specref ref="id-xqft-static-context-components"/>,
and are modified
by any <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTOptionDecl" xlink:type="simple">FTOptionDecl</nt>s
in the <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-Prolog" xlink:type="simple">Prolog</nt>.
The resulting settings are then propagated unchanged
to every <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTContainsExpr" xlink:type="simple">FTContainsExpr</nt> in the module
(including those in <code>VarDecl</code>s and <code>FunctionDecl</code>s,
and including any that happen to be nested within
another <code>FTContainsExpr</code>).
At any given <code>FTContainsExpr</code>,
the settings from the static context
are copied to the <code>FTContainsExpr</code>'s inner settings,
which are then propagated down the syntax tree.
At each <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTPrimaryWithOptions" xlink:type="simple">FTPrimaryWithOptions</nt>,
the locally specified match options (if any)
overwrite the corresponding inner setting(s).
At each <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTWords" xlink:type="simple">FTWords</nt>,
the inner settings are used
as the effective match options
for tokenizing the query strings
and matching them against the tokens in the text.
(These inner settings could be seen
as a parallel set of components in the static context,
but Section <specref ref="tq-semantics"/> models them
as structures that get passed as parameters
to various semantic functions.)
</p>

<p>
Thus, when a match option appears in an <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTSelection" xlink:type="simple">FTSelection</nt>,
it applies to the associated <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTPrimary" xlink:type="simple">FTPrimary</nt>,
but not to any <code>FTContainsExpr</code>s
that happen to be embedded within that <code>FTPrimary</code>.
Instead, for a nested <code>FTContainsExpr</code>,
the default match options are those declared in the <code>Prolog</code>
or, if not declared in the <code>Prolog</code>,
then supplied by the implementation's initial values. 
</p>
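
<p>A small sketch of this propagation, using a parenthesized selection. The
particular combination of options is an assumption chosen for illustration.
The options written after the closing parenthesis become the inner settings
for everything inside the parentheses, and the option written directly after
the string overwrites the case setting for that FTWords only:</p>
<eg role="xpath" xml:space="preserve">/books/book/title ftcontains
    ("Usability" case sensitive) case insensitive with stemming</eg>
<p>Here the effective match options for <code>"Usability"</code> would be
"case sensitive" (specified locally) and "with stemming" (inherited from the
enclosing FTPrimaryWithOptions). The remaining groups keep the values
propagated from the static context.</p>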

<p>
<termdef id="dt-match-option-order" term="match option application order">
The order in which effective match options for an 
<nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTWords" xlink:type="simple">FTWords</nt> are applied 
 is called the <term>match option application order</term>.</termdef>
This order is significant
because match options are not always commutative.
For example,
<code>synonym(stem(word))</code>
is not always the same as
<code>stem(synonym(word))</code>.
</p>

<p>
The match option application order is subject to some constraints:
<olist>
<item><p>The Language Option must be applied first.</p></item>
<item><p>The Stemming Option must be applied before the Case Option and the
Diacritics Option.</p></item>
</olist>
Aside from these constraints, the full order of the application of match
options is <termref def="dt-implementation-defined">implementation-defined</termref>.
</p>

<p>
More information on the semantics of match options
is given in <specref ref="FTMatchOptionsSec"/>.</p>

<p>If no match option declarations are present in the prolog and the
implementation does not override the initial values of the static context
components for the match options, the query:</p> 

<eg role="xpath" xml:space="preserve">/books/book/title ftcontains "usability" </eg>

<p>is, assuming "de" is the <termref def="dt-implementation-defined">implementation-defined</termref> default language,
equivalent to the query:</p>

<eg role="xpath" xml:space="preserve">/books/book/title ftcontains "usability" 
    language "de"
    without wildcards
    without thesaurus
    without stemming
    case insensitive 
    diacritics insensitive 
    without stop words</eg>


<p> We describe each match option group in more detail in the following
sections.</p>


<div3 id="ftlanguageoption">
	<head>Language Option</head>

<scrap headstyle="show"><head/>
	<prod num="175" id="doc-xquery-FTLanguageOption"><lhs>FTLanguageOption</lhs><rhs>"language"  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-StringLiteral" xlink:type="simple">StringLiteral</nt></rhs></prod>
</scrap>

<p><termdef id="dt-ftlanguageoption" term="language option">A 
<term>language option</term> 
modifies token matching by specifying the language of search tokens and 
phrases.</termdef></p>

<p>The StringLiteral following the keyword <code>language</code>
designates one language. It must be castable to <code>xs:language</code>; otherwise, an
error is raised: <xerrorref spec="XP" class="TY" code="0004" type="type"/>. </p> 

<p>The "language" option influences tokenization, stemming, and stop
words in an <termref def="dt-implementation-defined">implementation-defined</termref> way. The "language" option <termref def="may">MAY</termref> influence the behavior of other match options in an <termref def="dt-implementation-defined">implementation-defined</termref> way.</p>

<p>The set of standardized language identifiers is defined in <bibref ref="BCP47"/>.
The set of valid language identifiers among the standardized set is <termref def="dt-implementation-defined">implementation-defined</termref>. 
An implementation <termref def="may">MAY</termref> choose to use private extensions introduced by a
singleton 'x' for additional language identifiers, or other singletons
for registered extensions as described in sec. 2.2.6 of <bibref ref="BCP47"/>.
It is <termref def="dt-implementation-defined">implementation-defined</termref> what additional language identifiers, if any, are valid. 
If an invalid language identifier is specified, then the behavior is <termref def="dt-implementation-defined">implementation-defined</termref>. 
If the implementation chooses to raise an error in that case,
it must raise <errorref class="ST" code="0009"/>.
</p>

<p>The default language is specified in the static context. </p>

<!-- 2007-01-19 Jim: make effect of conflicting languages implementation-defined -->
<p>When an XQuery and XPath Full Text processor evaluates text in a document
that is governed by an xml:lang attribute and
the portion of the full-text query doing that evaluation contains an FTLanguageOption that
specifies a different language from the language specified by the governing xml:lang attribute,
the language-related behavior of that full-text query is <termref def="dt-implementation-defined">implementation-defined</termref>. </p>

<p>This is an example where
the language option is used to select the appropriate stop word list: </p>
<eg role="xpath" xml:space="preserve">//book[@number="1"]//editor ftcontains "salon de the"
with default stop words language "fr"</eg> 

</div3>


<div3 id="ftwildcardoption">
	<head>Wildcard Option</head>

<scrap headstyle="show"><head/>
	<prod num="176" id="doc-xquery-FTWildCardOption"><lhs>FTWildCardOption</lhs><rhs>("with"  "wildcards")  |  ("without"  "wildcards")</rhs></prod>
</scrap>

<p><termdef id="dt-ftwildcardoption" term="wildcard option">A 
<term>wildcard option</term>
modifies token and phrase matching by specifying whether wildcards are used 
or not.</termdef></p>

<p>When the "with wildcards" option is used, wildcard indicators
(represented by periods (.)) and qualifiers may be appended to or
inserted into the query tokens. 
If the period is at the beginning of a query token, the wildcard is a prefix
wildcard. If the period is at the end of a query token, it is a suffix
wildcard. If the period is inserted into a query token, it is an infix
wildcard. 
</p>
<p>
Each indicator and qualifier in a query token
will match zero or more characters within a token in the text being searched,
as described 
below. 
The number of characters matched depends on the qualifier. 
Qualifiers available are none, question mark, asterisk,
plus sign, and two numbers separated by a comma,
both enclosed by curly braces. </p>
 
<olist>

<item> 
<p>If a period is present, but there are no qualifiers, exactly one character
in the text will match.
</p>
</item> 
 
<item> 
<p>If a period is followed by a question mark (.?), zero or one
characters in the text being searched will match. </p>
</item> 

 
<item> 
<p>If a period is followed by an asterisk (.*), zero or more
characters will match.</p>
</item> 

 
<item> 
<p>If a period is followed by a plus sign (.+), one or more characters
will match. </p>
</item> 
 
 
<item> 
<p>If a period is followed by two numbers separated by a comma, both
enclosed by curly braces (.{n,m}), a specified range of characters
(at least n characters and no more than m characters)
will match.</p>
</item> 

</olist>
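
<p>The qualifiers listed above can be illustrated as follows. This is a sketch:
the variable <code>$p</code> stands for some hypothetical node to be searched,
and the tokens named in the comments are only examples of text tokens that
each pattern could match, assuming the query strings are tokenized so that
each pattern remains a single query token:</p>
<eg role="xquery" xml:space="preserve">(
  (: no qualifier, exactly one character: "well", "wall", "will" :)
  $p ftcontains "w.ll" with wildcards,
  (: zero or one character: "site", "sites" :)
  $p ftcontains "site.?" with wildcards,
  (: zero or more characters: "improv", "improve", "improving" :)
  $p ftcontains "improv.*" with wildcards,
  (: one or more characters: "tests", "testing", but not "test" :)
  $p ftcontains "test.+" with wildcards,
  (: between 2 and 5 characters: "usable", "usability" :)
  $p ftcontains "usab.{2,5}" with wildcards
)</eg>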

<p>When "with wildcards" is present and an indicator or qualifier character 
is intended to be taken literally (as itself), that character must be
preceded by ("escaped by") a backslash (\). 
For example, a period (.) that is intended to be a sentence terminator or
a decimal point must be preceded by a backslash so that it is not
interpreted to be an indicator. 
Similarly, a question mark (?), asterisk (*), or plus sign (+) that is
intended to be interpreted as an ordinary text character must be preceded by
a backslash so that it is not interpreted as a qualifier. </p>
 
<p>The "without wildcards" option finds tokens without recognizing
wildcard indicators and qualifiers. 
Periods, question marks, asterisks, plus signs, and two numbers
separated by a comma, both enclosed by curly braces,
are always recognized as ordinary text characters.</p>

<p>The default is "without wildcards".</p>

<p>
Note: Wildcard indicators and qualifiers may be token boundaries. How text with
wildcard indicators and qualifiers is tokenized is implementation-defined.
</p>

<p>The following expression returns true, because the <code>title</code> element
contains "improving":</p>
<eg role="xpath" xml:space="preserve">//book[@number="1"]/title ftcontains "improv.*" with wildcards</eg>
 
<p>The following expression returns true, because the <code>title</code> element
contains "site":</p>
<eg role="xpath" xml:space="preserve">//book[@number="1"]/title ftcontains ".?site" with wildcards</eg>

<p>The following expression returns true, because the <code>p</code> element
contains "well":</p>
<eg role="xpath" xml:space="preserve">//book[@number="1"]/p ftcontains "w.ll" with wildcards</eg> 

<p>The following expression returns false, because the <code>p</code> element
does not contain the phrase "w ll":</p>
<eg role="xpath" xml:space="preserve">//book[@number="1"]/p ftcontains "w.ll" without wildcards</eg> 
<p>
(Note that, without wildcards, the sample tokenization
will treat the period in "w.ll" as punctuation,
thus producing "w" and "ll" as separate tokens.)
</p>

</div3>


<div3 id="ftthesaurusoption">
	<head>Thesaurus Option</head>

<scrap headstyle="show"><head/>
	<prod num="170" id="doc-xquery-FTThesaurusOption"><lhs>FTThesaurusOption</lhs><rhs>("with"  "thesaurus"  (<nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTThesaurusID" xlink:type="simple">FTThesaurusID</nt>  |  "default"))<br/>|  ("with"  "thesaurus"  "("  (<nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTThesaurusID" xlink:type="simple">FTThesaurusID</nt>  |  "default")  (","  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTThesaurusID" xlink:type="simple">FTThesaurusID</nt>)*  ")")<br/>|  ("without"  "thesaurus")</rhs></prod>
	<prod num="171" id="doc-xquery-FTThesaurusID"><lhs>FTThesaurusID</lhs><rhs>"at"  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-URILiteral" xlink:type="simple">URILiteral</nt>  ("relationship"  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-StringLiteral" xlink:type="simple">StringLiteral</nt>)?  (<nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTRange" xlink:type="simple">FTRange</nt>  "levels")?</rhs></prod>
	<prod num="143" id="doc-xquery-URILiteral"><lhs>URILiteral</lhs><rhs><nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-StringLiteral" xlink:type="simple">StringLiteral</nt></rhs></prod>
</scrap>

<p><termdef id="dt-ftthesaurusoption" term="thesaurus option">A 
<term>thesaurus option</term>
modifies token and phrase matching by specifying whether a thesaurus is used or
not.</termdef>
 If thesauri are used, the thesaurus option specifies how to locate 
the thesauri, either by default or through a URI
reference. It also states the relationship to be applied and how many
levels within the thesaurus are to be traversed.</p>

<p>Thesauri add related tokens and phrases to the query or change query tokens.  
Thus, the
user may narrow, broaden, or otherwise modify the query using
synonyms, hypernyms (more generic terms), etc. The search is performed
as though the user has specified all related query tokens and phrases
in a disjunction (FTOr). </p>

<note><p>A thesaurus may be standards-based or locally-defined. It may be a
traditional thesaurus, or a taxonomy, soundex, ontology, or topic
map. How the thesaurus is represented is <termref def="dt-implementation-dependent">implementation-dependent</termref>.</p>
</note> 

<p>FTThesaurusID specifies the relationship sought between tokens and
phrases written in the query and terms in the thesaurus and the number
of levels to be queried in hierarchical relationships by including an
FTRange "levels". If no levels are specified, the default is to query
all levels in hierarchical relationships.</p>

<p>Relationships include, but are not limited to, the relationships
and their abbreviations presented in <bibref ref="iso-2788"/> and
their equivalents in other languages. The set of relationships supported by an
implementation is <termref def="dt-implementation-defined">implementation-defined</termref>, but
implementations <termref def="should">SHOULD</termref> support the relationships
defined in <bibref ref="iso-2788"/>. The terms in the following list have the
meanings 
defined in <bibref ref="iso-2788"/>. If a query specifies thesaurus
relationships or levels not supported by the thesaurus, or does not specify a
relationship, 
the behavior is <termref def="dt-implementation-defined">implementation-defined</termref>.
</p>
<olist>
<item><p> <emph>equivalence relationships (synonyms):</emph> PREFERRED TERM (USE), 
NONPREFERRED USED FOR TERM (UF);</p></item>
<item><p> <emph>hierarchical relationships:</emph> BROADER TERM (BT), 
NARROWER TERM (NT),  BROADER TERM GENERIC (BTG), NARROWER TERM GENERIC (NTG), 
BROADER TERM PARTITIVE (BTP), NARROWER TERM PARTITIVE (NTP), 
TOP Terms (TT); and</p></item> 
<item><p> <emph>associative relationships:</emph> RELATED TERM (RT).</p></item>
</olist>

<p>The "with thesaurus" option specifies that string matches include
tokens that can be found in one of the specified thesauri.
When "default" is used in place of an FTThesaurusID, the thesauri
specified in the static context are used. These are either given by the 
prolog declaration for the thesaurus option or, if no such
declaration exists, a system-defined default thesaurus with a 
system-defined relationship. The
default thesaurus may be used in combination with other explicitly
specified thesauri.</p>

<p>The "without thesaurus" option specifies that no thesaurus will be
used. </p>

<p>The default is "without thesaurus". </p>
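
<p>The alternatives of the FTThesaurusOption production can be sketched as
follows. The thesaurus URI is the one used in the examples of this section.
The search token, the relationship "BT", and the level range are assumptions
chosen for illustration, and whether a given thesaurus supports them depends
on the implementation:</p>
<eg role="xpath" xml:space="preserve">(: the default thesaurus from the static context :)
.//book/content ftcontains "duties" with thesaurus default

(: the default thesaurus combined with an explicitly specified one,
   following broader terms for exactly 1 level :)
.//book/content ftcontains "duties" with thesaurus (default,
  at "http://bstore1.example.com/UsabilityThesaurus.xml"
  relationship "BT" exactly 1 levels)

(: no thesaurus, which is also the default :)
.//book/content ftcontains "duties" without thesaurus</eg>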

<p>The following expression returns true, because it finds a <code>content</code>
element containing "tasks", which the thesaurus identifies as a synonym for
"duties":</p>

<eg role="xpath" xml:space="preserve">.//book/content ftcontains "duties" with
thesaurus at "http://bstore1.example.com/UsabilityThesaurus.xml"
relationship "UF"</eg>

<p>The following expression returns <code>book</code> elements, because it finds a
<code>content</code> element containing "web site components", and
narrower terms "navigation" and "layout":</p>

<eg role="xpath" xml:space="preserve">doc("http://bstore1.example.com/full-text.xml")
/books/book[./content ftcontains "web site components" with
thesaurus at "http://bstore1.example.com/UsabilityThesaurus.xml"
relationship "NT" at most 2 levels]</eg>

<p>Assuming the thesaurus available at URL 
"http://bstore1.example.com/UsabilitySoundex.xml" 
contains soundex capabilities, the following query
returns a <code>book</code> element containing "Marigold", which
sounds like "Merrygould":</p>

<eg role="xpath" xml:space="preserve">doc("http://bstore1.example.com/full-text.xml")
/books/book[. ftcontains "Merrygould" with thesaurus at
"http://bstore1.example.com/UsabilitySoundex.xml" relationship
"sounds like"]</eg>

<!--
<p>The following expression returns the true if "Synonyms" is a thesaurus for synonyms 
in the English language:</p>
<eg role="xpath">/books/book[@number="1"]//p ftcontains "buttress" with
thesaurus "Synonyms"</eg>
-->


</div3>

<div3 id="ftstemoption">
	<head>Stemming Option</head>

<scrap headstyle="show"><head/>
			<prod num="169" id="doc-xquery-FTStemOption"><lhs>FTStemOption</lhs><rhs>("with"  "stemming")  |  ("without"  "stemming")</rhs></prod>
</scrap>

<p><termdef id="dt-ftstemoption" term="stemming option">A <term>stemming option</term>
modifies token and
phrase matching by specifying whether stemming is applied or not.
</termdef></p>

<p>The "with stemming" option specifies that matches may contain tokens
that have the same stem as the tokens and phrases written in the
query. It is <termref def="dt-implementation-defined">implementation-defined</termref> what a stem of a token is. </p>

<p>The "without stemming" option specifies that the tokens and
phrases are not stemmed. </p>

<p>It is <termref def="dt-implementation-defined">implementation-defined</termref> whether the stemming is based on an
algorithm, dictionary, or mixed approach. </p>

<p>The default is "without stemming". </p>


<p>The following expression returns true, because the <code>title</code> of the specified
<code>book</code> contains "improving", which has the same stem as
"improve":</p>

<eg role="xpath" xml:space="preserve">/books/book[@number="1"]/title ftcontains "improve" with stemming </eg>


</div3>


<div3 id="ftcaseoption">
	<head>Case Option</head>

<scrap headstyle="show"><head/>
			<prod num="167" id="doc-xquery-FTCaseOption"><lhs>FTCaseOption</lhs><rhs>("case"  "insensitive")<br/>|  ("case"  "sensitive")<br/>|  "lowercase"<br/>|  "uppercase"</rhs></prod>
</scrap>

<p><termdef id="dt-ftcaseoption" term="case option">A <term>case option</term>
modifies the matching of tokens and phrases by specifying how uppercase and 
lowercase characters are considered.</termdef>
</p>


<p>There are four possible character case options:</p>

<olist>
<item><p> Using the option "case insensitive", tokens and phrases are matched
regardless of the case of the characters in the query tokens and phrases.</p></item>

<item><p> Using the option "case sensitive", tokens and phrases are matched
if and only if the case of their characters is the same as written in the
query.</p></item>

<item><p> Using the option "lowercase", tokens and phrases are matched if
and only if they match the query without regard to character case, but contain 
only lowercase characters.</p></item>

<item><p> Using the option "uppercase", tokens and phrases are matched if
and only if they match the query without regard to character case, but contain 
only uppercase characters.</p></item>

</olist>

<p>The default is "case insensitive". </p>

<p>The effect of the case options is also influenced by the query's 
default collation 
(see <xspecref spec="XQ" ref="static_context"/> and
 <xspecref spec="XQ" ref="id-default-collation-declaration"/>).
The following table summarizes how these interact.</p>

<p>
 <table border="1">
      <caption>Case Matrix</caption>
      <thead>
       <tr>
        <th rowspan="1" colspan="1">Case option \ Default collation</th>
        <th rowspan="1" colspan="1">UCC (Unicode Codepoint Collation)</th>
        <th rowspan="1" colspan="1">CCS (some generic case-sensitive collation)</th>
        <th rowspan="1" colspan="1">CCI (some generic case-insensitive collation) </th>
       </tr>
      </thead>
      <tbody>
       <tr>
        <th rowspan="1" colspan="1">case insensitive</th>
        <td rowspan="1" colspan="1">compare as if both lower</td>
        <td rowspan="1" colspan="1">case-insensitive variant of CCS if it exists, else error</td>
        <td rowspan="1" colspan="1">CCI</td>
       </tr>
       <tr>
        <th rowspan="1" colspan="1">case sensitive</th>
        <td rowspan="1" colspan="1">UCC</td>
        <td rowspan="1" colspan="1">CCS</td>
        <td rowspan="1" colspan="1">case-sensitive variant of CCI if it exists, else error</td>
       </tr>
       <tr>
        <th rowspan="1" colspan="1">lowercase</th>
        <td rowspan="1" colspan="1">compare using UCC after applying fn:lower-case() to the query 
           string
        </td>
        <td rowspan="1" colspan="1">compare using CCS after applying fn:lower-case() to the query 
           string</td>
        <td rowspan="1" colspan="1">CCI</td>
       </tr>
       <tr>
        <th rowspan="1" colspan="1">uppercase</th>
        <td rowspan="1" colspan="1">compare using UCC after applying fn:upper-case() to the query 
           string</td>
        <td rowspan="1" colspan="1">compare using CCS after applying fn:upper-case() to the query 
           string</td>
        <td rowspan="1" colspan="1">CCI</td>
       </tr>
      </tbody>
     </table>
</p>

<note><p>In this table, "else error" means "Otherwise, an error
is raised: <xerrorref spec="FO" class="CH" code="0002" type="dynamic"/>". 
The phrase "if it exists" is used, because
the case-sensitive collation CCS does not always have a
case-insensitive variant (and, even if one exists, it may not be
possible to determine it algorithmically), and because the
case-insensitive collation CCI does not always have a case-sensitive
variant (and, even if one exists, it may not be possible to determine
it algorithmically).</p></note>

<p>The following expression returns false, because the <code>title</code> element
doesn't contain "usability" in lower-case characters:</p>

<eg role="xpath" xml:space="preserve">//book[@number="1"]/title ftcontains "Usability" lowercase </eg>

<p>The following expression returns true, because the character case is not
considered:</p>

<eg role="xpath" xml:space="preserve">//book[@number="1"]/title ftcontains "usability" case insensitive</eg>
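
<p>The UCC column of the case matrix can be illustrated with the following sketch.
It assumes that the default collation is the Unicode codepoint collation (declared
explicitly here) and that tokenization splits the text at whitespace; the inline
<code>title</code> element is hypothetical and is not part of the sample document:</p>

<eg role="parse-test" xml:space="preserve">
declare default collation
  "http://www.w3.org/2005/xpath-functions/collation/codepoint";

let $title := &lt;title&gt;Usability testing&lt;/title&gt;
return (
  $title ftcontains "usability" case insensitive, (: true :)
  $title ftcontains "usability" case sensitive,   (: false: the text has "Usability" :)
  $title ftcontains "TESTING" lowercase,          (: true: query compared as "testing" :)
  $title ftcontains "usability" uppercase         (: false: query compared as "USABILITY" :)
)
</eg>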


</div3>

<div3 id="ftdiacriticsoption">
	<head>Diacritics Option</head>

<scrap headstyle="show"><head/>
	<prod num="168" id="doc-xquery-FTDiacriticsOption"><lhs>FTDiacriticsOption</lhs><rhs>("diacritics"  "insensitive")<br/>|  ("diacritics"  "sensitive")</rhs></prod>
</scrap>

<p><termdef id="dt-ftdiacriticsoption" term="diacritics option">A 
<term>diacritics option</term>
modifies token and phrase matching by specifying how diacritics are considered.
</termdef></p>

<p>There are two possible diacritics options:</p>

<olist>
<item><p>The option "diacritics insensitive" matches tokens and
phrases with and without diacritics. Whether diacritics are written in
the query or not is not considered.</p></item>

<item><p>The option "diacritics sensitive" matches tokens and phrases only
if they contain the diacritics as they are written in the query.</p></item>

</olist>

<p>The default is "diacritics insensitive". </p>

<p>The effect of the diacritics options is also influenced by the query's 
default collation 
(see <xspecref spec="XQ" ref="static_context"/> and
 <xspecref spec="XQ" ref="id-default-collation-declaration"/>).
The following table summarizes how these interact.</p>

<p>
    <table border="1">
      <caption>Diacritics Matrix</caption>
      <thead>
       <tr>
        <th rowspan="1" colspan="1">Diacritics option \ Default collation</th>
        <th rowspan="1" colspan="1">UCC (Unicode Codepoint Collation)</th>
        <th rowspan="1" colspan="1">CDS (some generic diacritics-sensitive collation)</th>
        <th rowspan="1" colspan="1">CDI (some generic diacritics-insensitive collation) </th>
       </tr>
      </thead>
      <tbody>
       <tr>
        <th rowspan="1" colspan="1">diacritics insensitive</th>
        <td rowspan="1" colspan="1">UCC comparison, but without considering diacritics</td>
        <td rowspan="1" colspan="1">diacritics-insensitive variant of CDS
                                  if it exists, else error</td>
        <td rowspan="1" colspan="1">CDI</td>
       </tr>
       <tr>
        <th rowspan="1" colspan="1">diacritics sensitive</th>
        <td rowspan="1" colspan="1">UCC</td>
        <td rowspan="1" colspan="1">CDS</td>
        <td rowspan="1" colspan="1">diacritics-sensitive variant of CDI if it exists, else error</td>
       </tr>
      </tbody>
     </table>
</p>

<note><p>In this table, "else error" means "Otherwise, an error
is raised: <xerrorref spec="FO" class="CH" code="0002" type="dynamic"/>". 
The phrase "if it exists" is used, because
the diacritics-sensitive collation CDS does not always have a
diacritics-insensitive variant (and, even if one exists, it may not be
possible to determine it algorithmically), and because the
diacritics-insensitive collation CDI does not always have a
diacritics-sensitive variant (and, even if one exists, it may not be
possible to determine it algorithmically).</p></note>

<p>The following expression returns true, because the token "Véra" in the
<code>editor</code> element is matched, as the acute accent is not 
considered in the comparison:</p>

<eg role="xpath" xml:space="preserve">//book[@number="1"]//editor ftcontains "Vera" diacritics insensitive</eg>

<p>This returns false, because the <code>editor</code> element does not
contain the token "Vera" in this exact form, i.e. without any diacritics:</p>

<eg role="xpath" xml:space="preserve">//book[@number="1"]/editors ftcontains "Vera" diacritics sensitive</eg>
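
<p>The UCC column of the diacritics matrix can be sketched in the same way. The
example assumes that the default collation is the Unicode codepoint collation
(declared explicitly here); the inline <code>editor</code> element is hypothetical
and is not part of the sample document:</p>

<eg role="parse-test" xml:space="preserve">
declare default collation
  "http://www.w3.org/2005/xpath-functions/collation/codepoint";

let $editor := &lt;editor&gt;Véra&lt;/editor&gt;
return (
  $editor ftcontains "Vera" diacritics insensitive, (: true: the accent is ignored :)
  $editor ftcontains "Vera" diacritics sensitive,   (: false: the text has "Véra" :)
  $editor ftcontains "Véra" diacritics sensitive    (: true: the diacritics agree :)
)
</eg>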


</div3>
<!--<div3 id="ftspecialcharoption">
	<head>FTSpecialCharOption</head>

<scrap><head></head>
			<prodrecap ref="FTSpecialcharOption"/>
</scrap>

<p><nt def="FTSpecialCharOption">FTSpecialCharOption</nt>
specifies whether special characters such as punctuation should or
should not be ignored. </p>

<p>Influences the way <nt def="FTWords">FTWords</nt> is
applied. </p>

<p>The option "with special characters" specifies that special
characters such as punctuation must also be matched. The option
"without special characters" specifies that special characters such as
punctuation need not be matched.
</p>

<p>The default is "without special characters". </p>


<eg role="xpath">//book[@number="1"]//editor ftcontains "Tudor Medina" with 
special characters </eg> 

<p>returns true.</p>

<eg role="xpath">//book[@number="1"]/editors ftcontains "Tudor-Medina" without
special characters </eg> 

<p>returns false.</p>


</div3>
-->

<div3 id="ftstopwordoption">
	<head>Stop Word Option</head>

<scrap headstyle="show"><head/>
	<prod num="172" id="doc-xquery-FTStopWordOption"><lhs>FTStopWordOption</lhs><rhs>("with"  "stop"  "words"  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTStopWords" xlink:type="simple">FTStopWords</nt>  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTStopWordsInclExcl" xlink:type="simple">FTStopWordsInclExcl</nt>*)<br/>|  ("without"  "stop"  "words")<br/>|  ("with"  "default"  "stop"  "words"  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTStopWordsInclExcl" xlink:type="simple">FTStopWordsInclExcl</nt>*)</rhs></prod>
	<prod num="173" id="doc-xquery-FTStopWords"><lhs>FTStopWords</lhs><rhs>("at"  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-URILiteral" xlink:type="simple">URILiteral</nt>)<br/>|  ("("  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-StringLiteral" xlink:type="simple">StringLiteral</nt>  (","  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-StringLiteral" xlink:type="simple">StringLiteral</nt>)*  ")")</rhs></prod>
	<prod num="174" id="doc-xquery-FTStopWordsInclExcl"><lhs>FTStopWordsInclExcl</lhs><rhs>("union"  |  "except")  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTStopWords" xlink:type="simple">FTStopWords</nt></rhs></prod>
</scrap>

<p><termdef id="dt-ftstopwordoption" term="stop word option">A 
<term>stop word option</term>
controls matching of FTWords by specifying whether stop words are used or not. 
Stop words are tokens in the query that match any token in the text being
searched. 
</termdef>
Normally a stop word matches 
exactly one token, but there may be <termref def="dt-implementation-defined">implementation-defined</termref> conditions, under
which a stop word may match a different number of tokens.</p>

<p><nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTStopWords" xlink:type="simple">FTStopWords</nt> specifies the list
of stop words either explicitly as a comma-separated list of string
literals, or by the keyword <code>at</code> followed by a literal URI.
If the URI specifies a list of stop words that is not found in the statically
known stop word lists, an error is raised <errorref class="ST" code="0008"/>. 
Whether the stop word
list is resolved from the statically known stop word lists or given explicitly,
no tokenization is performed on the stop words: they are used as they occur  
in the list.
</p>

<p>The "with stop words" option specifies that if a token is within the
specified collection of stop words, it is removed from the search and
any token may be substituted for it. Stop words retain their position
numbers and are counted in <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTDistance" xlink:type="simple">FTDistance</nt>
and <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTWindow" xlink:type="simple">FTWindow</nt> searches.</p>

<p>Multiple stop word lists may be combined using "union" or "except".
The keywords "union" and "except" are applied from left to right. If "union" is specified, every string occurring in the lists  
specified by the left-hand side or the right-hand side is a stop 
word. If "except" is specified, only strings occurring in the list  
specified by the left-hand side but not in the list specified
by the right-hand side are stop words. </p>

<p>The "with default stop words" option specifies that an
<termref def="dt-implementation-defined">implementation-defined</termref> collection of stop words is used. </p>

<p>The "without stop words" option specifies that no stop words are
used. This is equivalent to specifying an empty list of stop
words.</p>

<p>The default is "without stop words". </p>

<note>
<p>
Some implementations may apply stop word lists during indexing and be
unable to comply with query-time requests to not apply those stop words. An
implementation may still support stop-word options (and therefore not raise
<errorref class="ST" code="0006"/>)
by applying any additional stop words specified in the query.
Pre-application of irrevocable stop word lists falls under
implementation-defined tokenization behavior in this case, and a query that
specifies "without stop words" may still have some words ignored.
</p>
</note>

<p>The following expression returns true, because the document contains the phrase
"propagating few errors":</p>

<eg role="xpath" xml:space="preserve">/books/book[@number="1"]//p ftcontains "propagation of errors"
with stemming with stop words ("a", "the", "of") </eg>

<p>Note the asymmetry in the stop word semantics: the property of
being a stop word is only relevant to query terms, not to document
terms. Hence, it is irrelevant for the above-mentioned match whether
"few" is a stop word or not, and on the other hand we do not want the
query above to match "propagation" followed by 2 stop words, or even a
sequence of 3 stop words in the document.</p>

<p>The following expression returns false. In this case specifying "few" as 
a stop word has no effect, since "few" does not appear in the query.
Although the words "propagating" and "errors" appear in the text being
searched, the phrase  
"propagating errors" cannot be matched, since that phrase does not occur.</p>

<eg role="xpath" xml:space="preserve">/books/book[@number="1"]//p ftcontains "propagating errors" 
with stop words ("few")</eg> 

<p>The following expression returns false, because "of" is not in the <code>p</code>
element between "propagating" and "errors":</p>

<eg role="xpath" xml:space="preserve">/books/book[@number="1"]//p ftcontains "propagation of errors" 
with stemming without stop words</eg> 

<p>The following expression uses the stop words list specified at the
URL. Assuming that the specified stop word list contains the word
"then", this query
is reduced to a query on the phrase "planning X conducting", allowing any
token as a substitute for X.  It returns a <code>book</code> element,
because its <code>content</code> element contains "planning then
conducting". It would also return the <code>book</code> if the
phrases "planning and conducting" and "planning before conducting"
had been in its <code>content</code>:</p>

<eg role="xpath" xml:space="preserve">
doc("http://bstore1.example.com/full-text.xml")
/books/book[count(.//content ftcontains "planning then 
conducting" with stop words at 
"http://bstore1.example.com/StopWordList.xml")&gt;0]
</eg>

<p>The following expression returns <code>book</code>s containing "planning then
conducting", but does not return <code>book</code>s containing "planning
and conducting", since it exempts "then" from being a stop word: </p>

<eg role="xpath" xml:space="preserve">
doc("http://bstore1.example.com/full-text.xml")
/books/book[count(.//content ftcontains "planning then conducting"
with stop words at "http://bstore1.example.com/StopWordList.xml"
except ("the", "then"))&gt;0]
</eg>
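
<p>The left-to-right application of "union" and "except" can be sketched with
explicit stop word lists. The inline <code>p</code> element is hypothetical and is
not part of the sample document, and the results assume that a stop word matches
exactly one token:</p>

<eg role="parse-test" xml:space="preserve">
let $p := &lt;p&gt;planning then conducting&lt;/p&gt;
return (
  (: effective list is ("and"): "and" may be replaced by any token, so true :)
  $p ftcontains "planning and conducting"
     with stop words ("and", "or") except ("or"),
  (: effective list is ("or", "and"): true for the same reason :)
  $p ftcontains "planning and conducting"
     with stop words ("or") union ("and"),
  (: effective list is empty: "and" must occur literally, so false :)
  $p ftcontains "planning and conducting"
     with stop words ("and") except ("and")
)
</eg>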

</div3>



<div3 id="ftextensionoption">
<head>Extension Option</head>

<p><termdef id="dt-ftextensionoption" term="extension option">An
<term>extension option</term> is a match option that acts in an
<termref def="dt-implementation-defined">implementation-defined</termref> way.
</termdef>
</p>

<scrap headstyle="show">
<head/>
    <prod num="177" id="doc-xquery-FTExtensionOption"><lhs>FTExtensionOption</lhs><rhs>"option"  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-QName" xlink:type="simple">QName</nt>  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-StringLiteral" xlink:type="simple">StringLiteral</nt></rhs></prod>
</scrap>

<p>An extension option consists of an identifying QName and a StringLiteral.
Typically, a particular option will be recognized by some implementations and
not by others. The syntax is designed so that option declarations can be
successfully parsed by all implementations.
</p> 

<p>The QName of an extension option must resolve to a namespace URI
and local name, using 
the statically known namespaces.</p>

<note><p>There is no default namespace for options.</p></note>

<p>Each implementation recognizes an 
<termref def="dt-implementation-defined">implementation-defined</termref>
set of namespace
URIs used to denote extension options.</p>

<p>If the namespace part of the QName is not a namespace recognized by the
implementation as one used to denote extension options, then the extension option
is ignored.</p>

<p>Otherwise, the effect of the extension option, including its error behavior,
is <termref def="dt-implementation-defined">implementation-defined</termref>.
For example, if the local part of the QName is
not recognized, or if the StringLiteral does not conform to the rules
defined by the implementation for the particular extension option, the implementation may choose
whether to report an error, ignore the extension option, or take some
other action.</p>

<p>Implementations may impose rules on where particular extension options may
appear relative to other match options, and the
interpretation of an option declaration may depend on its position.</p>

<p>An extension option must not be used to change the syntax accepted by the
processor, or to suppress the detection of static errors. However, it may be
used without restriction to modify the set of tokens in the query or how they
are matched against tokens in the text being searched. 
An extension option has the same scope as other match options.
</p>

<p>The following examples illustrate several possible uses for extension
options:</p>
<p>This extension option is set as part of the static context of all 
full-text expressions in the module and might be used to ensure that 
queries are insensitive to Arabic short-vowels.
</p>
<eg role="parse-test" xml:space="preserve">
declare namespace exq = "http://example.org/XQueryImplementation";

declare ft-option option exq:diacritics "short-vowel insensitive"
</eg>
<p>This extension option applies only to the matching in the full-text
selection in which it is found and might be used to specify how compound words
should be matched.
</p>
<eg role="parse-test" xml:space="preserve">
declare namespace exq = "http://example.org/XQueryImplementation";

//para[. ftcontains
         ("Kinder" ftand "Platz" distance exactly 1 words)
         with stemming
	 option exq:compounds "distance=1" ]
</eg>

</div3>
</div2>


<div2 id="logical_ftoperators">
      <head>Logical Full-Text Operators</head>
      
<p>
Full-text selections can be combined with the logical connectives
<code>ftor</code> (full-text or), <code>ftand</code> (full-text and), <code>not in</code> (mild not),
and <code>ftnot</code> (unary full-text not).</p>

<scrap headstyle="show"><head/>
<prod num="145" id="doc-xquery-FTOr"><lhs>FTOr</lhs><rhs><nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTAnd" xlink:type="simple">FTAnd</nt> ( "ftor"  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTAnd" xlink:type="simple">FTAnd</nt> )*</rhs></prod>
<prod num="146" id="doc-xquery-FTAnd"><lhs>FTAnd</lhs><rhs><nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTMildNot" xlink:type="simple">FTMildNot</nt> ( "ftand"  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTMildNot" xlink:type="simple">FTMildNot</nt> )*</rhs></prod>
<prod num="147" id="doc-xquery-FTMildNot"><lhs>FTMildNot</lhs><rhs><nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTUnaryNot" xlink:type="simple">FTUnaryNot</nt> ( "not"  "in"  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTUnaryNot" xlink:type="simple">FTUnaryNot</nt> )*</rhs></prod>
<prod num="148" id="doc-xquery-FTUnaryNot"><lhs>FTUnaryNot</lhs><rhs>("ftnot")? <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTPrimaryWithOptions" xlink:type="simple">FTPrimaryWithOptions</nt></rhs></prod>
</scrap>
 
<div3 id="sec-ftor">
<head>Or-Selection</head>
<p><termdef id="dt-or-selection" term="or-selection">An
<term>or-selection</term> combines two full-text selections using the 
<code>ftor</code> operator.</termdef>
</p>
       
<p>An or-selection finds all matches that satisfy at least
one of the operand full-text selections. </p>

<p>The following expression returns the <code>book</code> element written by
"Millicent":</p> 
	
<eg role="xpath" xml:space="preserve">//book[.//author ftcontains "Millicent" ftor "Voltaire"]</eg>
</div3>

<div3 id="sec-ftand">
<head>And-Selection</head>
<p><termdef id="dt-and-selection" term="and-selection">An
<term>and-selection</term> combines two full-text selections using the 
<code>ftand</code> operator.</termdef>
</p>
       
<p>An and-selection finds matches that satisfy all of the operand full-text 
selections simultaneously. A match of an and-selection is formed by combining
matches for each of the operand full-text selections as described in
<specref ref="tq-ft-fs-FTAnd"/>. </p>

<p>For example, <code>"usability" ftand "testing"</code> will find two 
matches
in <code>//book[@number="1"]/title</code>: each of the two matches for the
FTWords selection <code>"usability"</code> (the two occurrences of 
 "usability" in the string value of the title element) is combined 
with the single match for the FTWords <code>"testing"</code> (only one 
occurrence of "testing" in the title).
Since the above and-selection has at least one match, the following
expression will return "true". </p>

<eg role="xpath" xml:space="preserve">//book[@number="1"]/title ftcontains ("usability" ftand "testing")</eg>

<p>The following expression returns false, because "Millicent" and "Montana" are not
contained by the same <code>author</code> element in any <code>book</code>
element:</p>

<eg role="xpath" xml:space="preserve">//book/author ftcontains "Millicent" ftand "Montana"</eg>

<p>No <code>author</code> element in any <code>book</code> element 
contains both "Millicent" and "Montana". Therefore, for any such 
<code>author</code> element, there is either one match for the 
FTWords <code>"Millicent"</code> and zero matches for the FTWords 
<code>"Montana"</code>, or vice versa, or no match for either
of them. In any of these cases, the and-selection will have zero 
matches.</p>
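
<p>The difference between or-selections and and-selections can be sketched on
inline data. The two <code>author</code> elements below are hypothetical stand-ins
rather than the sample document, and the results assume that each item of the
search context is evaluated separately:</p>

<eg role="parse-test" xml:space="preserve">
let $authors := (&lt;author&gt;Millicent Marigold&lt;/author&gt;,
                 &lt;author&gt;Montana Marigold&lt;/author&gt;)
return (
  (: true: the first author satisfies one operand :)
  $authors ftcontains "Millicent" ftor "Voltaire",
  (: false: no single author contains both tokens :)
  $authors ftcontains "Millicent" ftand "Montana",
  (: true: the second author contains both tokens :)
  $authors ftcontains "Montana" ftand "Marigold"
)
</eg>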
</div3>

<div3 id="sec-ftmildnot">
<head>Mild-Not Selection</head>
<p><termdef id="dt-mild-not-selection" term="mild-not selection">A
<term>mild-not selection</term> combines two full-text selections 
using the <code>not in</code> operator.</termdef>
</p>

<p>The <code>not in</code> operator is a milder form of the operator combination
<code>ftand ftnot</code>. The selection <code>A not in B</code> matches a token
sequence that matches <code>A</code>, but not when it is a part of a 
match of <code>B</code>. 
In contrast, <code>A ftand ftnot B</code> only finds matches when the token 
sequence contains <code>A</code> and does not contain <code>B</code>.</p>

<p>
As an example, consider a search for <code>"Mexico" not in "New Mexico"</code>.
This may return, among others, a document
which is all about "Mexico" but mentions at the end that "New Mexico
was named after Mexico". The occurrence of "Mexico" in "New Mexico" is not 
considered, but other occurrences of "Mexico" are matched. Note that this
document would not be matched by the full-text selection 
<code>"Mexico" ftand ftnot "New Mexico"</code>.</p>

<p> A match to a mild-not selection must
contain at least one token that satisfies the first
condition and does not satisfy the second condition. If it contains a
token that satisfies both the first and the second
condition, the token is not considered as a match.</p>

<p>The following expression returns true, because "usability" appears in the
<code>title</code> and the <code>p</code> elements and the token within
the phrase "Usability Testing" in the <code>title</code> element is not
considered:</p>

<eg role="xpath" xml:space="preserve">/books/book ftcontains "usability" not in "usability testing"</eg>
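
<p>The contrast between <code>not in</code> and <code>ftand ftnot</code> described
above can be sketched as follows. The inline <code>p</code> elements are
hypothetical and are not part of the sample document:</p>

<eg role="parse-test" xml:space="preserve">
let $a := &lt;p&gt;New Mexico was named after Mexico&lt;/p&gt;
let $b := &lt;p&gt;Santa Fe is in New Mexico&lt;/p&gt;
return (
  (: true: the last "Mexico" is not part of a match of "New Mexico" :)
  $a ftcontains "Mexico" not in "New Mexico",
  (: false: the text contains "New Mexico" :)
  $a ftcontains "Mexico" ftand ftnot "New Mexico",
  (: false: the only "Mexico" is part of "New Mexico" :)
  $b ftcontains "Mexico" not in "New Mexico"
)
</eg>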

<p>Operands of a mild-not selection may not contain a full-text selection
that evaluates to an <term>AllMatches</term> that contains a <term>StringExclude</term>. Such 
full-text selections are not-selections and 
<nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTWords" xlink:type="simple">FTWords</nt> with a cardinality constraint that uses an <code>at most</code>,
<code>from ... to</code>, or <code>exactly</code> occurrence range. 
If such an expression is encountered, an error <errorref class="DY" code="0017"/>
is raised.
</p>
</div3>

<div3 id="sec-ftnot">
<head>Not-Selection</head>
<p><termdef id="dt-unary-not-selection" term="not-selection">A
<term>not-selection</term> is a full-text selection starting with the prefix 
operator <code>ftnot</code>.</termdef></p>

<p>A not-selection selects matches that do not
satisfy the operand full-text selection.
Details about how such matches are constructed are given in <specref ref="tq-ft-fs-FTUnaryNot"/>.
</p>

<p>The following expression returns the empty sequence, because all <code>book</code>
elements contain "usability":</p>
<eg role="xpath" xml:space="preserve">//book[. ftcontains ftnot "usability"]</eg>

<p>The following expression returns true, because <code>book</code> elements contain
"information" and "retrieval" but not "information retrieval":</p>

<eg role="xpath" xml:space="preserve">//book ftcontains "information" ftand
"retrieval" ftand ftnot "information retrieval"</eg>

<p>The following expression returns <code>book</code> elements containing "web site
usability" but not "usability testing":</p>

<eg role="xpath" xml:space="preserve">//book[. ftcontains "web site usability" ftand 
ftnot "usability testing"]</eg>
</div3>

</div2>

<div2 id="ftposfilter">
        <head>Positional Filters</head>

<scrap headstyle="show"><head/>
	<prod num="157" id="doc-xquery-FTPosFilter"><lhs>FTPosFilter</lhs><rhs><nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTOrder" xlink:type="simple">FTOrder</nt>  |  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTWindow" xlink:type="simple">FTWindow</nt>  |  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTDistance" xlink:type="simple">FTDistance</nt>  |  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTScope" xlink:type="simple">FTScope</nt>  |  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTContent" xlink:type="simple">FTContent</nt></rhs></prod>
</scrap>


<p><termdef id="dt-ftposfilter" term="positional filter">
<term>Positional filters</term> are postfix operators that serve to
filter matches based on various constraints on their positional
information.</termdef></p>

<p>
Recall that the grammar rule for <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTSelection" xlink:type="simple">FTSelection</nt>
allows an arbitrary number of positional filters to follow an
<nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTOr" xlink:type="simple">FTOr</nt>. Multiple adjacent positional filters are
applied from left to right, i.e., the first filter is applied to the
result of the <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTOr" xlink:type="simple">FTOr</nt>, the second is applied to the
result of that first application, and so on.
</p>

<div3 id="ftorder">
	<head>Ordered Selection</head>

<scrap headstyle="show"><head/>
<prod num="158" id="doc-xquery-FTOrder"><lhs>FTOrder</lhs><rhs>"ordered"</rhs></prod>
</scrap>

<p><termdef id="dt-ordered-selection" term="ordered selection">An
<term>ordered selection</term> consists of a full-text selection followed by 
the postfix operator "ordered".</termdef>

An ordered selection constrains the order of tokens and
phrases to be the same as the order in which they are written in the
operand selection.
</p>

<p> The default is unordered. Unordered is in effect when ordered is
not specified in the query. Unordered cannot be written explicitly in
the query.  </p>

<p>An ordered selection selects matches which satisfy the operand full-text
selection and which also satisfy the following constraint: the order
that the matching tokens or phrases have in the text being searched
is the same order that the corresponding query tokens or phrases have in the
operand selection. In both cases, the ordering is determined from the minimum
start positions of the constituent tokens.
</p>

<p>The following expression returns true, because titles of <code>book</code> elements
contain "web site" and "usability" in the order in which they are written in
the query, i.e., "web site" must precede "usability":</p>

<eg role="xpath" xml:space="preserve">//book/title ftcontains ("web site" ftand "usability") ordered</eg>

<p>The following expression returns false, because although "Montana" and "Millicent"
both appear in the <code>book</code> element, they do not appear in the order they
are written in the query:</p>

<eg role="xpath" xml:space="preserve">//book[@number="1"] ftcontains ("Montana" ftand "Millicent") ordered</eg>
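
<p>The ordering constraint can also be sketched on a single inline element. The
<code>title</code> constructed here is hypothetical and is not part of the sample
document:</p>

<eg role="parse-test" xml:space="preserve">
let $title := &lt;title&gt;Web Site Usability&lt;/title&gt;
return (
  (: true: "web site" precedes "usability" in both the query and the text :)
  $title ftcontains ("web site" ftand "usability") ordered,
  (: false: the query order is the reverse of the text order :)
  $title ftcontains ("usability" ftand "web site") ordered,
  (: true: without "ordered", the order is not constrained :)
  $title ftcontains ("usability" ftand "web site")
)
</eg>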


</div3>

<div3 id="ftwindow">
	<head>Window Selection</head>

<scrap headstyle="show"><head/>
	<prod num="159" id="doc-xquery-FTWindow"><lhs>FTWindow</lhs><rhs>"window"  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-AdditiveExpr" xlink:type="simple">AdditiveExpr</nt>  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTUnit" xlink:type="simple">FTUnit</nt></rhs></prod>
        <prod num="161" id="doc-xquery-FTUnit"><lhs>FTUnit</lhs><rhs>"words"  |  "sentences"  |  "paragraphs"</rhs></prod>
</scrap>

<p><termdef id="dt-window-selection" term="window selection">A
<term>window selection</term> consists of a full-text selection followed
by one of the (complex) postfix operators derived from <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTWindow" xlink:type="simple">FTWindow</nt>.</termdef> 
A window selection selects matches which satisfy the operand full-text
selection and for which the matched tokens and phrases, more precisely the 
individual StringIncludes of that match, are found
within a number of <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTUnit" xlink:type="simple">FTUnit</nt>s (words, sentences, and paragraphs). The number of <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTUnit" xlink:type="simple">FTUnit</nt>s is
specified by an AdditiveExpr that is converted as though it were an argument to a
function with the expected type of <code>xs:integer</code>.</p>

<p>A window selection may cross element
boundaries. The size of the window is not affected by the presence or
absence of element boundaries. Stop words are included in the
computation of the window size whether they are ignored by the query or not.</p>


<p>
A window selection examines the matches generated by the preceding
portion of the <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTSelection" xlink:type="simple">FTSelection</nt>, and selects those for
which the matched 
tokens and phrases (more precisely, the individual StringIncludes of
that match) are all found within a window whose size is a specified
number of FTUnits (words, sentences, or paragraphs); for each such
window, the window selection then generates a match containing the
merge of those StringIncludes, plus any StringExcludes that fall
within the window.
</p>

<p>The following expression returns true, because "web", "site", and "usability" are
within a window of 5 tokens in the <code>title</code> element:</p>

<eg role="xpath" xml:space="preserve">/books/book/title ftcontains "web" ftand "site"
ftand "usability" window 5 words</eg>

<p>The following expression returns true, because "web" and "site", in the order in which they are
written in the query, and either "usability" or "testing" are all found within a
window of 10 tokens:</p>

<eg role="xpath" xml:space="preserve">/books/book ftcontains ("web" ftand "site" ordered)
ftand ("usability" ftor "testing") window 10 words</eg>

<p>The following expression returns true, because the <code>title</code> element
contains "Web Site Usability". A similar query on the <code>p</code> element
would not return true, 
because its occurrences of "web site" and "usability" are not within a
window of 3 tokens:</p>
<eg role="xpath" xml:space="preserve">/books/book//title ftcontains "web site" ftand
"usability" window 3 words</eg>

<p>The following expression returns the sample <code>book</code> element, 
because its <code>number</code> attribute is 1 and it contains a
window of 2 words that includes an occurrence of "efficient"
but no occurrence of "and". There is just one such window
in the sample text, and it contains "enable efficient".</p>

<eg role="xpath" xml:space="preserve">/books/book[@number="1" and . ftcontains "efficient" 
ftand ftnot "and" window 2 words]</eg>

<p>The following expression returns the empty sequence, because in the selected
<code>book</code> element, every window of 3 tokens that contains an occurrence
of "efficient" also contains an occurrence of "and":</p>

<eg role="xpath" xml:space="preserve">/books/book[@number="1" and . ftcontains "efficient" 
ftand ftnot "and" window 3 words]</eg>
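
<p>The examples above all measure the window in words, but the FTUnit may also
be <code>sentences</code> or <code>paragraphs</code>. As an illustrative sketch
against the same sample data, the following expression asks for "usability" and
"testing" to occur within a window of one sentence. No result is stated for it
here, on the assumption that the outcome depends on how the implementation's
tokenization determines sentence boundaries:</p>

<eg role="xpath" xml:space="preserve">/books/book ftcontains "usability" ftand "testing"
window 1 sentences</eg>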

<p>
In order to allow meaningful results for nested positional filters,
e.g., a window selection embedded inside a distance selection, the
resulting matches for window selections are formed from the input matches
that satisfy the window constraint as follows. All StringIncludes of
such a match are coerced into a single StringInclude that spans all
token positions from the smallest to the largest position of any of the
input StringIncludes. This is explained in more detail in Section <specref ref="ftdistance"/>.
</p>
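
<p>As an illustrative sketch of such nesting, the following expression embeds a
window selection inside a distance selection. Under the rule above, each match
of the inner window selection carries a single StringInclude spanning from
"web" to "site" (or from "site" to "web"), so the outer distance selection
measures the distance between that span and "usability" rather than between
the individual tokens. No result against the sample data is claimed here:</p>

<eg role="xpath" xml:space="preserve">/books/book ftcontains ("web" ftand "site" window 2 words)
ftand "usability" distance at most 5 words</eg>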
</div3>

<div3 id="ftdistance">
	<head>Distance Selection</head>

<scrap headstyle="show"><head/>
<prod num="160" id="doc-xquery-FTDistance"><lhs>FTDistance</lhs><rhs>"distance"  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTRange" xlink:type="simple">FTRange</nt>  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTUnit" xlink:type="simple">FTUnit</nt></rhs></prod>
<prod num="156" id="doc-xquery-FTRange"><lhs>FTRange</lhs><rhs>("exactly"  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-AdditiveExpr" xlink:type="simple">AdditiveExpr</nt>)<br/>|  ("at"  "least"  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="pro