<!-- note: need to add all termdefs, otherwise termrefs dont work : but XQ contains dupes of termdefs in intro --><!-- SB 2004-01-28: added entities date.day thru ndash, removed "xpath-backwards-compat", "errors", "XQ" --><spec xmlns:e="http://www.w3.org/1999/XSL/Spec/ElementSyntax" id="spec-top" w3c-doctype="wd">
<header>
<title>XQuery 1.0 and XPath 2.0 Full-Text 1.0</title>
<w3c-designation>WD-xpath-full-text-10</w3c-designation>
<w3c-doctype>W3C Working Draft</w3c-doctype>
<pubdate>
 <day>18</day>
 <month>May</month>
 <year>2007</year>
</pubdate>

<publoc>
<loc xmlns:xlink="http://www.w3.org/1999/xlink" href="http://www.w3.org/TR/2007/WD-xpath-full-text-10-20070518/" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest">http://www.w3.org/TR/2007/WD-xpath-full-text-10-20070518/</loc>
</publoc>

<altlocs>
   <loc xmlns:xlink="http://www.w3.org/1999/xlink" href="http://www.w3.org/TR/2007/WD-xpath-full-text-10-20070518/xpath-full-text.xml" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest">XML</loc>
</altlocs>

<latestloc>
<loc xmlns:xlink="http://www.w3.org/1999/xlink" href="http://www.w3.org/TR/xpath-full-text-10/" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest">http://www.w3.org/TR/xpath-full-text-10/</loc>
</latestloc>

<prevlocs>
  <loc xmlns:xlink="http://www.w3.org/1999/xlink" href="http://www.w3.org/TR/2006/WD-xquery-full-text-20060501/" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest"/>
  <loc xmlns:xlink="http://www.w3.org/1999/xlink" href="http://www.w3.org/TR/2005/WD-xquery-full-text-20051103/" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest"/>
  <loc xmlns:xlink="http://www.w3.org/1999/xlink" href="http://www.w3.org/TR/2005/WD-xquery-full-text-20050915/" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest"/>
  <loc xmlns:xlink="http://www.w3.org/1999/xlink" href="http://www.w3.org/TR/2005/WD-xquery-full-text-20050404/" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest"/>
  <loc xmlns:xlink="http://www.w3.org/1999/xlink" href="http://www.w3.org/TR/2004/WD-xquery-full-text-20040709/" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest"/>
</prevlocs>

<authlist>
<author>
	<name>Sihem Amer-Yahia</name>
	<affiliation>AT&amp;T Labs - Research</affiliation>
<!--	<email href="mailto:sihem@research.att.com">sihem@research.att.com</email> -->
</author>
<author>
	<name>Chavdar Botev</name>
	<affiliation>Invited Expert</affiliation>
<!--	<email href="mailto:cbotev@cs.cornell.edu">cbotev@cs.cornell.edu</email> -->
</author>
<author>
	<name>Stephen Buxton</name>
	<affiliation>Mark Logic Corporation</affiliation>
<!--	<email href="mailto:stephen.buxton@marklogic.com">stephen.buxton@marklogic.com</email> -->
</author>
<author>
	<name>Pat Case</name>
	<affiliation>Library of Congress</affiliation>
<!--	<email href="mailto:pcase@crs.loc.gov">pcase@crs.loc.gov</email> -->
</author>
<author>
  <name>Jochen Doerre</name>
  <affiliation>IBM</affiliation>
<!--  <email href="mailto:doerre@de.ibm.com">doerre@de.ibm.com</email> -->
</author>
<author>
	<name>Mary Holstege</name>
	<affiliation>Mark Logic Corporation</affiliation>
<!--	<email href="mailto:mary.holstege@marklogic.com">mary.holstege@marklogic.com</email> -->
</author>
<!-- <author>
	<name>Darin McBeath</name>
	<affiliation>Elsevier</affiliation>
	<email href="mailto:D.McBeath@elsevier.com">D.McBeath@elsevier.com</email>
</author> -->
<author>
	<name>Jim Melton</name>
	<affiliation>Oracle</affiliation>
<!--  <email href="mailto:jim.melton@oracle.com">jim.melton@oracle.com</email> -->
</author>
<author>
	<name>Michael Rys</name>
	<affiliation>Microsoft</affiliation>
<!--	<email href="mailto:mrys@microsoft.com">mrys@microsoft.com</email> -->
</author>
<author>
	<name>Jayavel Shanmugasundaram</name>
	<affiliation>Invited Expert</affiliation>
<!--	<email href="mailto:jai@cs.cornell.edu">jai@cs.cornell.edu</email> -->
</author>
</authlist>

<abstract>
<p>This document defines the syntax and formal semantics of XQuery 1.0 and XPath 2.0 Full-Text 1.0,
which is a language that extends XQuery 1.0 <bibref ref="xquery"/>
and XPath 2.0 <bibref ref="xpath20"/> with full-text search capabilities.</p>
</abstract>

<status>

<p>
<emph>This section describes the status of this document at the time of its publication. 
Other documents may supersede this document. 
A list of current W3C publications and the latest revision of this technical report can be 
found in the <loc xmlns:xlink="http://www.w3.org/1999/xlink" href="http://www.w3.org/TR/" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest">W3C technical reports index</loc> at 
http://www.w3.org/TR/.</emph>
</p>

<p>
This is a
<loc xmlns:xlink="http://www.w3.org/1999/xlink" href="http://www.w3.org/2005/10/Process-20051014/tr.html#last-call" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest">Last Call Working Draft</loc>
for review by W3C Members and other interested parties.
This document was produced following the procedures set out for the W3C Process and
was defined jointly by the 
<loc xmlns:xlink="http://www.w3.org/1999/xlink" href="http://www.w3.org/Style/XSL/" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest">XSL Working Group</loc>
and the 
<loc xmlns:xlink="http://www.w3.org/1999/xlink" href="http://www.w3.org/XML/Query" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest">XML Query Working Group</loc>
(both part of the 
<loc xmlns:xlink="http://www.w3.org/1999/xlink" href="http://www.w3.org/XML/Activity.html" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest">XML Activity</loc>).
It is designed to be read in conjunction with the following documents:
<bibref ref="xquery"/>,
<bibref ref="xqueryft-requirements"/>, and
<bibref ref="xmlquery-full-text-use-cases"/>. 
</p>

<p>
Publication as a Working Draft does not imply endorsement by the W3C Membership. 
This is a draft document and may be updated, replaced or obsoleted by other documents at 
any time. 
It is inappropriate to cite this document as other than work in progress.
</p>

<p>
This document defines a language for expressing full-text queries on XML documents; the 
language is specified in the form of extensions to both
<loc xmlns:xlink="http://www.w3.org/1999/xlink" href="http://www.w3.org/TR/xpath20" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest">XPath 2.0</loc> and
<loc xmlns:xlink="http://www.w3.org/1999/xlink" href="http://www.w3.org/TR/xquery" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest">XQuery 1.0</loc>.  Organizations and individuals should review 
this document to determine the degree to which the language specified meets the needs of 
the full-text community.  The Working Groups believe that this work is essentially 
complete and intend to advance it as soon as possible. 
</p>

<p>
This is the sixth version of this document. 
Since the last version was published, several technical and editorial
changes have been made. 
The most significant changes are the following. 
The formal semantics diagrams have been redrawn. 
A conformance statement has been added. 
XML Schemas that together define the XML representation of
XQuery 1.0 and XPath 2.0 Full-Text have been added,
along with a stylesheet to transform that XML representation to the
ordinary XQuery syntax. 
Section 3 has been significantly restructured for clarity and readability. 
The semantics of nesting FTDistance selections have been made more useful. 
The semantics for FTMildNot now properly handle phrases. 
See Appendix <specref ref="id-xqft-changelog"/>
for more information on these and other changes.
</p>

<p>
Of the XQuery 1.0 and XPath 2.0 Full-Text documents, only this document,
XQuery 1.0 and XPath 2.0 Full-Text 1.0, is a Last Call document. 
The XQuery and XPath Full-Text Requirements <bibref ref="xqueryft-requirements"/>,
although not on the Recommendation track, is being republished concurrently
with this document in order to demonstrate the degree to which this document
satisfies those Requirements.  
The XQuery Full-Text Use Cases <bibref ref="xmlquery-full-text-use-cases"/>
document, although not on the Recommendation track, is being republished concurrently
with this document in order to illustrate various use cases that guided the design of the 
XQuery 1.0 and XPath 2.0 Full-Text specification. 
</p>

<p>
Public Last Call comments on this document and its open issues are invited.
Comments on this document are due by 22 June 2007.
Comments on this document should be made in W3C's
<loc xmlns:xlink="http://www.w3.org/1999/xlink" href="http://www.w3.org/Bugs/Public/" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest">public Bugzilla system</loc>
for this specification (instructions can be found at
<loc xmlns:xlink="http://www.w3.org/1999/xlink" href="http://www.w3.org/XML/2005/04/qt-bugzilla" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest">http://www.w3.org/XML/2005/04/qt-bugzilla</loc>). 
When entering comments, select the Product named "XPath / XQuery / XSLT", the Component 
named "Full Text", and the Version named "Last Call drafts". 
This repository includes open issues recorded by the Query Working Group as well as by 
members of the public. 
If access to the Bugzilla system is not feasible, you may send your comments to the W3C 
XSLT/XPath/XQuery mailing list,
<loc xmlns:xlink="http://www.w3.org/1999/xlink" href="mailto:public-qt-comments@w3.org" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest">public-qt-comments@w3.org</loc>.
It will be very helpful if you include the string [FT] in the subject line of your 
comment, whether made in Bugzilla or in email. 
Each Bugzilla entry and email message should contain only one comment. 
Archives of the comments and responses are available at
<loc xmlns:xlink="http://www.w3.org/1999/xlink" href="http://lists.w3.org/Archives/Public/public-qt-comments/" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest">http://lists.w3.org/Archives/Public/public-qt-comments/</loc>.
</p>

<p>
This document was produced by groups operating under the
<loc xmlns:xlink="http://www.w3.org/1999/xlink" href="http://www.w3.org/Consortium/Patent-Policy-20040205/" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest">5 February 2004
W3C Patent Policy</loc>.
W3C maintains a 
<loc xmlns:xlink="http://www.w3.org/1999/xlink" href="http://www.w3.org/2004/01/pp-impl/18797/status#disclosures" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest">public list of any 
patent disclosures</loc>
made in connection with the deliverables of the 
XML Query Working Group and also maintains a 
<loc xmlns:xlink="http://www.w3.org/1999/xlink" href="http://www.w3.org/2004/01/pp-impl/19552/status#disclosures" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest">public list of any patent 
disclosures</loc> made in connection with the deliverables of the XSL 
Working Group; those pages also include instructions for
disclosing a patent.
An individual who has actual knowledge of a patent which the individual believes contains
<loc xmlns:xlink="http://www.w3.org/1999/xlink" href="http://www.w3.org/Consortium/Patent-Policy-20040205/#def-essential" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest">Essential 
Claim(s)</loc>
must disclose the information in accordance with
<loc xmlns:xlink="http://www.w3.org/1999/xlink" href="http://www.w3.org/Consortium/Patent-Policy-20040205/#sec-Disclosure" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest">section 6 of the W3C 
Patent Policy</loc>. 
</p>

</status>

<langusage>
		<language id="EN">English</language>
		<language id="ebnf">EBNF</language>
</langusage>
	
<revisiondesc>

<p>SA January 2004: First version of document before Feb F2F</p>
<p>SA 26 February 2004: Second version of document before Feb F2F
    meetings.</p>
<p>JM 18 May 2007: Last Call Working Draft</p>

</revisiondesc></header>
<body>
<!-- *********************************************************************
      Section 1. Introduction
     ********************************************************************* -->

<div1 id="introduction">
  <head>Introduction</head>

  <p>This document defines the language and the formal semantics of
  XQuery 1.0 and XPath 2.0 Full-Text 1.0. This language is designed to meet the requirements
  identified in W3C XQuery and XPath Full-Text Requirements
  <bibref ref="xqueryft-requirements"/> and to support the queries in
  the W3C XQuery Full-Text Use Cases <bibref ref="xmlquery-full-text-use-cases"/>. </p> 

  <p>XQuery 1.0 and XPath 2.0 Full-Text 1.0 extends the syntax and semantics of XQuery 1.0 and
  XPath 2.0. </p>

	<div2 id="tq-ftsearch-xml">
		<head>Full-Text Search and XML</head> 

<p>As XML becomes mainstream, users expect to be able to 
search their XML documents. This requires a standard way to do
full-text search, as well as structured searches, against XML
documents.  A similar requirement for full-text search led ISO to
define the <!-- <loc
href="ftp://sqlstandards.org/SC32/WG4/Progression_Documents/CD/cd-fulltext-2001-05.pdf">SQL/MM-FT
standard</loc>.  --> SQL/MM-FT <bibref ref="sqlmm"/> standard.
SQL/MM-FT defines extensions to SQL to express
full-text searches providing functionality similar to that defined in this full-text
language extension to XQuery 1.0 and XPath 2.0.
</p>

<p>XML documents may contain highly structured data (fixed schemas, known types
such as numbers, dates), semi-structured data (flexible schemas and types),
markup data (text with embedded tags), and unstructured data (untagged
free-flowing text). Where a document contains unstructured
or semi-structured data, it is important to be able to search using
Information Retrieval techniques such as scoring and weighting.</p>

<p>Full-text search is different from substring search in many ways:</p>

<olist>
<item><p>A full-text search searches for tokens and phrases
rather than substrings. A substring search for news items that contain
the string "lease" will return a news item that contains "Foobar
Corporation releases the 20.9 version ...". A full-text search for the
token "lease" will not. </p>
</item>

<item><p>There is an expectation that a full-text search will support
language-based searches, which substring search cannot. An
example of a language-based search is "find me all the news items that
contain a token with the same linguistic stem as 'mouse'" (this finds "mouse"
and "mice"). Another example, based on token proximity, is "find me all
the news items that contain the tokens 'XML' and
'Query', allowing up to 3 intervening words".</p>
</item>

<item>
<p>Full-text search must address the vagaries and nuances of
language. Search results  are often of varying usefulness. When
you search a web site for cameras that cost less than $100, this
is an exact search.  There is a set of cameras that matches this search,
and a set that does not.  Similarly, when you do a string search across
news items for "mouse", there is only one expected result set. When you
do a full-text search for all the news items that contain the
token "mouse", you probably expect to find news items containing the token
"mice", and possibly "rodents", or possibly "computers".  Not
all results are equal. Some results are more "mousey" than others.
Because full-text search may be inexact, we have the notion of score
or relevance. We generally expect to see the most relevant results at
the top of the results list.</p>
<p>As XQuery and XPath evolve, they
 may apply the notion of
score to querying structured data. For example, when making travel
plans or shopping for cameras, it is sometimes useful to get an
ordered list of near matches in addition to exact matches. If
 XQuery and XPath define a generalized 
inexact match, we expect XQuery and XPath to utilize the scoring
framework provided by XQuery and XPath Full-Text.
</p>
</item>
</olist>
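
<p>As a rough illustration of the first two differences, the queries below contrast a substring search
with full-text searches. This is only a sketch: the <code>news</code> element name is hypothetical,
and the full-text syntax is the one introduced in <specref ref="section-ftcontainsexpr"/> and
<specref ref="ftmatchoptions"/>.</p>

<eg xml:space="preserve">(: substring search: also selects items containing "releases" :)
//news[fn:contains(., "lease")]

(: full-text search: selects only items containing the token "lease" :)
//news[. ftcontains "lease"]

(: language-based search: "mouse" with stemming is expected to match "mice" too,
   subject to the implementation's stemming algorithm :)
//news[. ftcontains "mouse" with stemming]</eg>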

<p>The following definitions apply to full-text search:</p>

<olist>

<item>
<p><termdef id="Full-TextQueriesDef" term="Full-TextQueries"><term>Full-text queries</term> are 
   performed on tokens and phrases. Tokens and phrases are produced via
   tokenization.</termdef> Informally, tokenization breaks a character string into a 
    sequence of words, units of punctuation, and spaces.</p>
</item>

<item>
<p><termdef id="TokenDef" term="Token">A <term>token</term> is defined
as a character, n-gram, or sequence of
characters returned by a tokenizer as a basic unit to be
searched. Each instance of a token consists of one or more consecutive
characters.  Beyond that, tokens are <termref def="dt-implementation-defined">implementation-defined</termref>.</termdef> Note that
consecutive tokens need not be separated by either punctuation or
space, and tokens may overlap. <termdef id="PhraseDef" term="Phrase">A <term>phrase</term> is an ordered sequence of any number of tokens. Beyond that,
 phrases are <termref def="dt-implementation-defined">implementation-defined</termref>.</termdef></p> 

<note><p>In some natural languages, tokens and words can be used
interchangeably.</p></note>
</item>

<item>
<p>Tokenization enables functions and operators that operate on a
part or the root of the token (e.g., wildcards, stemming). </p>

<p>Tokenization enables functions and operators which work with the
relative positions of tokens (e.g., proximity operators). </p>

<p>Tokenization also uniquely identifies sentences and paragraphs in which tokens appear. 
<termdef id="SentenceDef" term="Sentence">A <term>sentence</term> is an ordered sequence
of any number of tokens. 
Beyond that, sentences are <termref def="dt-implementation-defined">implementation-defined</termref>. 
A tokenizer is not required to support sentences.</termdef>
<termdef id="ParagraphDef" term="Paragraph">A <term>paragraph</term> is an ordered sequence
of any number of tokens. 
Beyond that, paragraphs are <termref def="dt-implementation-defined">implementation-defined</termref>. 
A tokenizer is not required to support paragraphs.</termdef>
Whatever a tokenizer for a particular language chooses to do, it must preserve
the containment hierarchy: paragraphs contain sentences, which contain
tokens. </p>

<p>The tokenizer must process two codepoint-equal strings in the same way,
i.e., it must identify the same tokens in each. 
Everything else about the behavior of the tokenizer is <termref def="dt-implementation-defined">implementation-defined</termref>.</p>
</item>


<item>
<p>
This specification focuses on functionality that serves all
languages. It also selectively includes functionalities useful within
specific families of languages. For example, searching within
sentences and paragraphs is useful to many western languages and to
some non-western languages, so that functionality is incorporated into
this specification.
</p>
</item>

<item>
<p>
Some XML elements represent semantic
markup, e.g., &lt;title&gt;. Others represent formatting markup, e.g.,
&lt;b&gt; to indicate bold.  Semantic markup serves well as token
boundaries. Some formatting markup serves
well as token boundaries, for example, paragraphs are most commonly delimited
by formatting markup. Other formatting markup may not serve well as token
boundaries. Implementations
are free to provide <termref def="dt-implementation-defined">implementation-defined</termref> ways to differentiate the
effects of different kinds of markup on token boundaries during tokenization.
</p>
</item>


<!--<item>
<p>We use the namespace "ft" (for full-text) that corresponds to the
URL http://www.w3.org/2004/07/xquery-full-text and defines the namespace of
full-text search. We also use "fts" for definitional purposes in <loc
href="#tq-semantics">semantics Section</loc>.
</p>
</item>
-->

</olist>
		<p>Certain aspects of language
		processing are described in this specification as
		<term>implementation-defined</term> or
		<term>implementation-dependent</term>.</p>

<ulist>
  <item>
    <p><termdef id="dt-implementation-defined" term="implementation defined"><term>Implementation-defined</term>
		indicates an aspect that may differ between
		implementations, but must be specified by the
		implementor for each particular
		implementation.</termdef></p>
  </item>
  <item>
    <p>
      <termdef id="dt-implementation-dependent" term="implementation dependent"><term>Implementation-dependent</term>
		indicates an aspect that may differ between
		implementations, is not specified by this or any W3C
		specification, and is not required to be specified by
		the implementor for any particular
		implementation.</termdef></p>
  </item>
</ulist>


</div2>

<div2 id="tq-ft-organization">
 <head>Organization of this document</head> 

<p>This document is organized as follows. We first present a <loc xmlns:xlink="http://www.w3.org/1999/xlink" href="#tq-extensions" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest">high level syntax</loc> for the XQuery 1.0 and XPath 2.0 Full-Text 1.0
language along with some examples. Then, we present the <loc xmlns:xlink="http://www.w3.org/1999/xlink" href="#ftselections" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest">syntax and examples</loc> of the
basic primitives in the XQuery 1.0 and XPath 2.0 Full-Text 1.0 language. This is followed by the
<loc xmlns:xlink="http://www.w3.org/1999/xlink" href="#tq-semantics" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest">semantics</loc> of the XQuery 1.0 and XPath 2.0 Full-Text 1.0
language. The appendix contains a section that provides an <loc xmlns:xlink="http://www.w3.org/1999/xlink" href="#id-xpath-grammar" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest">EBNF for the XPath 2.0 Grammar with Full-Text
extensions</loc>, an <loc xmlns:xlink="http://www.w3.org/1999/xlink" href="#id-grammar" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest">EBNF for XQuery 1.0
Grammar with Full-Text extensions</loc>, <loc xmlns:xlink="http://www.w3.org/1999/xlink" href="#ft-acknowledgements" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest">acknowledgements</loc> and a <loc xmlns:xlink="http://www.w3.org/1999/xlink" href="#ft-glossary" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest">glossary</loc>.</p>

	</div2>

<div2 id="tq-ft-namespaces">
  <head>A word about namespaces</head>

<p>Certain namespace prefixes are predeclared by XQuery 1.0 and, by implication, by this specification,
and bound to fixed namespace URIs. These namespace prefixes are as follows:
</p>

<ulist>

<item>
<p>
<code>xml = http://www.w3.org/XML/1998/namespace</code>
</p>
</item>

<item>
<p>
<code>xs = http://www.w3.org/2001/XMLSchema</code>
</p>
</item>

<item>
<p>
<code>xsi = http://www.w3.org/2001/XMLSchema-instance</code>
</p>
</item>

<item>
<p>
<code>fn = http://www.w3.org/2005/xpath-functions</code>
</p>
</item>

<item>
<p>
<code>local = http://www.w3.org/2005/xquery-local-functions</code>
</p>
</item>

</ulist>

<p>
In addition to the prefixes in the above list, this document uses the prefix
<code>err</code> to represent the namespace URI <code>http://www.w3.org/2005/xqt-errors</code>. 
This namespace prefix is not predeclared and its use in this document is not normative. 
Error codes that are not defined in this document are defined in other XQuery 1.0 and XPath 2.0
specifications, particularly <bibref ref="xpath20"/> and <bibref ref="xpath-functions"/>. 
</p>

<p>
Finally, this document uses the prefix <code>fts</code> to represent a namespace
containing a number of functions used in this document to describe the semantics
of XQuery 1.0 and XPath 2.0 Full-Text functions. There is no
requirement that these functions be implemented; therefore, no URI is associated with that prefix. 
</p>

</div2>
  
</div1>

<!--
2. TeXQuery Expressions
2.1   Processing Model
2.2   FTContainsExpr 
2.3   Scoring
2.3   Extensions to the Static Context

-->
<div1 id="tq-extensions">
   <head>Full-Text Extensions to XQuery and XPath</head>

<p>XQuery 1.0 and XPath 2.0 Full-Text extends the languages of XQuery
1.0 and XPath 2.0 in three ways. It:</p> 

<olist>
  <item><p>Adds a new expression called FTContainsExpr;</p>
  </item>

  <item><p>Enhances the syntax of FLWOR expressions in XQuery 1.0 and
  <code>for</code> expressions in XPath 2.0 with optional score
  variables; and</p>
  </item>

  <item><p>Adds static context declarations for full-text match
  options to the query prolog.</p>
  </item>
</olist>
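
<p>The first two extensions might be used together as in the following sketch, which assumes a
hypothetical <code>books</code> document with <code>book</code>, <code>title</code>, and
<code>content</code> elements. The score variable <code>$s</code> is bound to the relevance of
each selected book; how that value is computed is left to the implementation.</p>

<eg xml:space="preserve">for $b score $s in /books/book[content ftcontains "usability"]
order by $s descending
return $b/title</eg>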

<p>Additionally, it extends the data model and processing models in
various ways.</p>

<div2 id="processing-model">
<head>Processing Model</head>

<p>
As part of the External Processing that is described in the XQuery
Processing Model, when an XML document is parsed into an Infoset/PSVI
and ultimately into an XQuery Data Model instance, a
full-text process called tokenization is usually executed.</p> 

<p>
Tokenization, in general terms, is the process of converting a text
string into smaller units that are used in query processing. Those
units, called tokens, are the most basic text units that a full-text
search can refer to. Full-text operators typically work on sequences
of token occurrences found in the target text (nodes) of a
search. These token occurrences are characterized by unique
identifiers that capture the relative position of the token inside the
string, the relative position of the sentence containing the token,
and the relative position of the paragraph containing the token.</p>

<p>
Tokenization, including the definition of the term "words", <termref def="should">SHOULD</termref> be
<termref def="dt-implementation-defined">implementation-defined</termref>. 
Implementations <termref def="should">SHOULD</termref> expose the rules and sample
results of tokenization as much as possible to enable users to predict and
interpret the results of tokenization. 
Tokenization <termref def="must">MUST</termref> only conform to these constraints:</p>

<olist>

<item><p>
Each word <termref def="must">MUST</termref> consist of one or more consecutive characters;</p>
</item>

<item><p>
The tokenizer <termref def="must">MUST</termref> preserve the containment hierarchy
(<emph>e.g.</emph>, paragraphs contain sentences, which contain words); and</p>
</item>

<item><p>
The tokenizer <termref def="must">MUST</termref>, when tokenizing two equal strings,
identify the same tokens in each. </p>
</item>

</olist>

<p>
A sample tokenization is used for the examples in this document. 
The results might be different for other tokenizations. </p>

<p>
A <termref def="dt-ftcontains">full-text contains expression</termref>
(<specref ref="section-ftcontainsexpr"/>), evaluated within
the normal Query Processing (XQuery Processing Model), is composed of
several parts:</p>

<olist>

  <item><p>
  An XPath 2.0 or XQuery 1.0 expression (RangeExpr) that
  specifies the sequence of items to be searched. 
  <termdef id="dt-search-context" term="search context">
  Those items are called
  the <term>search context</term>.</termdef></p>
  </item>

  <item><p>
  The full-text selection to be applied (<specref ref="ftselections"/>).
  <term>Full-text selections</term> 
  are, syntactically and semantically, fully composable and contain:
  </p>
  <ulist>

    <item><p>
    Required:</p>

    <ulist>

      <item><p>
      Words and phrases for which a search is performed (<specref ref="ftwords"/>).</p>
      </item>

    </ulist>

    </item>

    <item><p>
    Optional:</p>

    <ulist>

      <item><p>
      Match options, such as indicators for case sensitivity and stop
      words (<specref ref="ftmatchoptions"/>);</p>
      </item>

      <item><p>
      Boolean full-text operators that compose a full-text selection from
      simpler full-text selections (<specref ref="logical_ftoperators"/>);</p>
      </item>

      <item><p>
      Other full-text operators that are constraints on the positions of
      matches, such as indicators for distance between tokens and for the
      cardinality of matches (<specref ref="ftposfilter"/> and 
      <specref ref="fttimes"/>); and</p>
      </item>

      <item><p>
      The weighting information. Each individual search term in a
      full-text selection may be annotated with optional weight
      information. This information may be used during the evaluation
      of the full-text selections to
      calculate scoring, information that quantifies the relevance of the
      result to the given search criteria.</p>
      </item>

    </ulist>

    </item>

  </ulist>

  </item>

  <item><p>
  An optional XPath 2.0 or XQuery 1.0 expression (UnionExpr) that
  specifies the set of nodes, descendants of the RangeExpr, whose
  contents may be ignored for the purpose of determining a match
  during the search (<specref ref="ftignoreoption"/>).</p>
  </item>

</olist>
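
<p>As an illustration, the parts listed above might appear in a single expression as follows.
This is a sketch only; the element names are hypothetical.</p>

<eg xml:space="preserve">/books/book[. ftcontains "usability" with stemming
              without content .//annotation]

(: search context:       .  (each book element)
   full-text selection:  "usability" with stemming
   ignore option:        without content .//annotation :)</eg>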

<p>
The results of the evaluation of the full-text selection operators are
instances of the AllMatches model, which complements the XQuery Data
Model (XDM) for processing full-text queries. An AllMatches instance
describes all possible solutions to the full-text query for a given
search context item. Each solution is described by a Match instance. A
Match instance contains the tokens from the search context that must
be included (described using StringInclude instances which model the
positive terms) and the tokens from the search context item that must be
excluded (described using StringExclude instances which model the
negative terms). Each negative or positive term is modeled as a tuple:
the position of the query word or phrase in the full-text selection, and a
TokenInfo structure that describes a consecutive sequence of token
occurrences in the text string which match the query word or phrase.
</p>

<graphic xmlns:xlink="http://www.w3.org/1999/xlink" source="images/ProcMod-XQueryFT.gif" alt="Processing Model Extensions" xlink:type="simple" xlink:show="embed" xlink:actuate="onLoad"/>

<p>Figure 1 provides a schematic overview of the XQuery 1.0 and XPath
2.0 Full-Text processing steps that are discussed in detail below. 
Some of these steps are completely outside the domain of XQuery; in
Figure 1, these are depicted outside the black line that represents
the boundaries of the language. The diagram shows only the central pieces
of the XQuery Processing Model (see <xspecref spec="XQ" ref="id-processing-model"/>), but zooms in on the Execution Engine,
where the processing of the Full-Text extensions takes place. The
full-text processing steps are labeled as FTn within the diagram and
are referenced within the text.</p>

<p>
Like all XQuery expressions, an FTContainsExpr returns an XDM
Instance (see Fig. 1). With the exception of FTWords, which consumes TokenInfos,
all full-text selections are closed under the AllMatches data model,
i.e., their input and output are AllMatches instances. Tokenization
normally occurs at the time of parsing of the original XML 
documents, for example, during the Data Model Generation process (see
Figure 1). Here, however, it may also occur "on the fly", transforming an XDM
instance into TokenInfos, which ultimately get converted into AllMatches
instances by the evaluation of full-text selections. Thus, the evaluation of
nested full-text and XQuery expressions moves back and forth
between these two models.
</p>

<p>
The resulting AllMatches instance obtained by the evaluation of a
full-text expression is converted into a Boolean value before being
returned to the enclosing XPath or XQuery operation as follows. The
AllMatches instance is viewed as a disjunction of its Match instances. If at
least one member of the disjunction contains only positive terms, then
the value returned is true. If all members of the disjunction contain
negative terms, the result is false.
</p>
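
<p>The conversion can be sketched in XQuery as follows. The sketch assumes a hypothetical XML
representation in which an <code>allMatches</code> element has one <code>match</code> child per
Match instance, each holding <code>stringInclude</code> and <code>stringExclude</code> children;
the function name is illustrative and is not part of this specification.</p>

<eg xml:space="preserve">declare function local:to-boolean($allMatches as element(allMatches))
  as xs:boolean
{
  (: true if at least one match has no negative terms :)
  some $m in $allMatches/match
  satisfies fn:empty($m/stringExclude)
};</eg>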

<p>
Weighting information, in an <termref def="dt-implementation-dependent">implementation-dependent</termref> fashion, may be
used when calculating the scoring information computed and made
available by FTContainsExpr to the optional score construct.
</p>

<p>
Given the components of a Full Text expression, the evaluation
algorithm proceeds according to the following steps, which are also referenced in the processing model diagram as steps FT<emph>n</emph> (see Fig. 1):
</p>

<olist>

  <item><p>
  Evaluate the search context expression, resulting in the set of
  search context items. (FT1 covers the evaluation of any XPath 2.0
  or XQuery 1.0 expressions that generate or modify the search
  context, as well as the query string(s) in a partially evaluated
  full-text selection.)</p>
  </item>

  <item><p>
  Evaluate the (optional) ignore expression, resulting in the set of ignored
  nodes, and virtually delete the ignored nodes from the tree of search
  context nodes. (Included in FT1)</p>
  </item>

  <item><p>
  Apply the tokenization algorithm to query string(s). </p>
  </item>

  <item><p>
  For each search context item:</p>

  <olist>

    <item><p>
    Apply the tokenization algorithm in order to extract potentially
    matching terms together with their positional information. 
    This step results in a sequence of token occurrences. </p> 
    </item>

    <item><p>
    Evaluate the simple FTWords operators in the full-text selection against
    the tokenized input. This results in a set of AllMatches instances.
    (FT3)</p>
    </item>

    <item><p>
    Evaluate the rest of the full-text selection operator tree in a bottom-up
    fashion. At each step the AllMatches instances produced by the
    previous steps are given as input, and a new AllMatches instance
    is obtained as output. At each step the FTMatchOptions
    control the semantics of the application of the FTWords
    operator. (FT4)</p>
    </item>

  </olist>

  </item>

  <item>
  
  <p>
  Convert the AllMatches instance into a Boolean value. (FT5)</p>

<!-- Bugzilla Bug# 3908 -->
  <p>
  The additional scoring information (also part of FT5) that is produced
  by the evaluation 
  of the Full Text expression is <termref def="dt-implementation-dependent">implementation-dependent</termref> and is not
  specified in this document. The scoring information is made available at the same time the
  Boolean value is returned.
  </p>

  </item>

</olist>

<!-- Bugzilla Bug# 3908 -->
<p>
Section <specref ref="ftselections"/>
describes the syntax and the informal semantics of Full Text operators. 
Their formal semantics as well as the formal definition of the
AllMatches data model are given in Section <specref ref="tq-semantics"/>.
</p>

</div2>


<div2 id="section-ftcontainsexpr">
   <head>Full-text Contains Expression</head>

<p>
<termdef id="dt-ftcontains" term="full-text contains expression">A
<term>full-text contains expression</term> is an expression that evaluates a
sequence of nodes against a full-text selection.
</termdef>
</p>
<p>As a syntactic construct a full-text contains expression
(grammar symbol: <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTContainsExpr" xlink:type="simple">FTContainsExpr</nt>) 
behaves like a
comparison expression (see <xspecref spec="XQ" ref="id-general-comparisons"/>).
This grammar rule introduces <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTContainsExpr" xlink:type="simple">FTContainsExpr</nt>.</p>


<scrap headstyle="show">
<head/>
<prod num="50" id="noid_N10466.doc-xquery-ComparisonExpr"><lhs>ComparisonExpr</lhs><rhs><nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTContainsExpr" xlink:type="simple">FTContainsExpr</nt> ( (<nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-ValueComp" xlink:type="simple">ValueComp</nt><br/>|  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-GeneralComp" xlink:type="simple">GeneralComp</nt><br/>|  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-NodeComp" xlink:type="simple">NodeComp</nt>)  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTContainsExpr" xlink:type="simple">FTContainsExpr</nt> )?</rhs></prod>
</scrap>

<p>A full-text contains expression may be used anywhere a
ComparisonExpr may be 
used. The <code>ftcontains</code> operator has higher precedence than
other comparison operators,  so the results of <code>ftcontains</code>
expressions may be compared without enclosing them in parentheses.</p>
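<p>As an illustrative sketch of this precedence rule (the element names
<code>title</code> and <code>content</code> are assumed for the example), the
following two path expressions are parsed identically; the parentheses in the
second merely make the grouping explicit:</p>
<eg role="xquery" xml:space="preserve">
/books/book[title ftcontains "dog" eq content ftcontains "cat"]

/books/book[(title ftcontains "dog") eq (content ftcontains "cat")]
</eg>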

<div3 id="section-ftcontainsexpr-description">
      <head>Description</head>


<scrap headstyle="show">
<head/>
<prod num="51" id="doc-xquery-FTContainsExpr"><lhs>FTContainsExpr</lhs><rhs><nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-RangeExpr" xlink:type="simple">RangeExpr</nt> ( "ftcontains"  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTSelection" xlink:type="simple">FTSelection</nt>  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTIgnoreOption" xlink:type="simple">FTIgnoreOption</nt>? )?</rhs></prod>
</scrap>

<p>A full-text contains expression returns a Boolean
value. It returns true if there is some node in
the RangeExpr that, after 
<loc xmlns:xlink="http://www.w3.org/1999/xlink" href="#TokenizationSec" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest">tokenization</loc>, 
matches the full-text selection <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTSelection" xlink:type="simple">FTSelection</nt>. See Section
<specref ref="ftselections"/> for more details. 
For the purpose of determining
a match, certain descendants of nodes (identified by 
<nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTIgnoreOption" xlink:type="simple">FTIgnoreOption</nt>) in 
the RangeExpr may be ignored, as specified in Section
<specref ref="ftignoreoption"/>.</p>

<p>An XQuery 1.0 and XPath 2.0 Full-Text processor <termref def="should">SHOULD</termref> try to use the
information available in xml:lang for processing of collations, as well as
the various match options defined in Section <specref ref="ftmatchoptions"/>. 
</p>

</div3>

<div3 id="section-ftcontainsexpr-examples">
   <head>Examples</head>

<p>The following example in XQuery 1.0 Full-Text returns the author of
each book with a title containing a token with the same root as
<code>dog</code> and the token
<code>cat</code>.

		<eg role="xquery" xml:space="preserve">
for $b in /books/book
where $b/title ftcontains ("dog" with stemming) ftand "cat" 
return $b/author</eg>
</p>
		<p>The same example in XPath 2.0 Full-Text is written as:

		<eg role="xpath" xml:space="preserve">

/books/book[title ftcontains ("dog" with stemming) ftand "cat"]/author</eg>
</p>
<p>This example selects books where either the title contains both the token
<code>dog</code> and the token <code>cat</code> and the content
does not contain a token with the same root as <code>train</code>, or the
title lacks at least one of those two tokens but the content does contain a token with the same root as <code>train</code>:</p>
<eg role="xquery" xml:space="preserve">
/books/book[title ftcontains "dog" ftand "cat" ne
            content ftcontains ("train" with stemming)]
</eg>
   </div3>

	</div2>

	<div2 id="section-score-variables">
	<head>Score Variables</head>
	<p>Besides specifying a match of a full-text 
        search as a Boolean condition, full-text search applications
        typically also have the ability to associate scores with
        the results. <termdef id="Scores" term="Scores"><term>Scores</term> express the relevance of 
      those results to the full-text search conditions.</termdef></p>

        <p>XQuery 1.0 and XPath 2.0 Full-Text extends the languages of
        XQuery 1.0 and XPath 2.0 further  by adding optional 
        <code>score</code> variables to the <code>for</code> and
        <code>let</code> clauses of FLWOR expressions.</p>

        <p>The production for the extended <code>for</code> clause follows.


<scrap headstyle="show">
<head/>
<prod num="35" id="doc-xquery-ForClause"><lhs>ForClause</lhs><rhs>"for"  "$"  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-VarName" xlink:type="simple">VarName</nt>  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-TypeDeclaration" xlink:type="simple">TypeDeclaration</nt>?  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-PositionalVar" xlink:type="simple">PositionalVar</nt>?  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTScoreVar" xlink:type="simple">FTScoreVar</nt>?  "in"  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-ExprSingle" xlink:type="simple">ExprSingle</nt>  (","  "$"  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-VarName" xlink:type="simple">VarName</nt>  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-TypeDeclaration" xlink:type="simple">TypeDeclaration</nt>?  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-PositionalVar" xlink:type="simple">PositionalVar</nt>?  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTScoreVar" xlink:type="simple">FTScoreVar</nt>?  "in"  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-ExprSingle" xlink:type="simple">ExprSingle</nt>)*</rhs></prod>
<prod num="37" id="doc-xquery-FTScoreVar"><lhs>FTScoreVar</lhs><rhs>"score"  "$"  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-VarName" xlink:type="simple">VarName</nt></rhs></prod>
</scrap> 
</p>

<p>When a <code>score</code> variable is present in a <code>for</code> 
clause, the evaluation of the expression following the <code>in</code>
keyword must determine not only the result sequence of the
expression, i.e., the sequence of items that are iteratively
bound to the <code>for</code> variable, but also, in each
iteration, the relevance score of the current item, to which
the <code>score</code> variable is then bound.</p> 

<p>The semantics of scoring and how it relates to second-order functions is 
discussed in Section <specref ref="ScoreSec"/>.</p>

<p>The following example selects the <code>book</code> elements that satisfy
the condition <code>[content ftcontains "web site" ftand "usability" and
.//chapter/title ftcontains "testing"]</code> and returns the scores assigned to
those <code>book</code> elements.

		<eg role="xquery" xml:space="preserve">
for $b score $s 
    in /books/book[content ftcontains "web site" ftand "usability" 
                   and .//chapter/title ftcontains "testing"]
return $s
</eg>
</p>

<p>XPath 2.0 Full-Text extends the language of XPath
2.0 in the <code>for</code> expression in the same 
way: with optional score variables. The example above is
also a legal example of the XPath 2.0 extension.</p>

<p>Scores are typically used to order results, as in the 
following, more complete example.
		<eg role="xquery" xml:space="preserve">
for $b score $s 
    in /books/book[content ftcontains "web site" ftand "usability"]
where $s &gt; 0.5
order by $s descending
return &lt;result&gt;  
          &lt;title&gt; {$b//title} &lt;/title&gt; 
          &lt;score&gt; {$s} &lt;/score&gt; 
       &lt;/result&gt;
</eg>
</p>

<p>Note that the score applies to the entire <code>for</code> expression.
In the following example, two separate full-text contains expressions are
used to select the matching paragraphs.  There is still just one score for each
<code>para</code> returned.  The highest scoring paragraphs will be returned
first:
</p>

<eg role="xquery" xml:space="preserve">
for $p score $s in //book[title ftcontains "software"]/para[. ftcontains "usability"]
     order by $s descending
  return $p
</eg>

<p>The following more elaborate example uses multiple score variables to
return the matching paragraphs ordered so that those from the highest scoring
books precede those from the lowest scoring books, where the highest scoring
paragraphs of each book are returned before the lower scoring paragraphs of
that book:
</p>
<eg role="xquery" xml:space="preserve">
for $b score $score1 in //book[title ftcontains "software"]
    order by $score1 descending
return
    for $p score $score2 in $b/para[. ftcontains "usability"]
       order by $score2 descending
    return $p
</eg>

<p>The <code>score</code> variable is bound to a value which reflects
the relevance of the match criteria in the 
full-text selections to the nodes in the respective RangeExprs. The
calculation of relevance is <termref def="dt-implementation-dependent">implementation-dependent</termref>, but score
evaluation must follow these rules:</p>

<olist>
<item><p>Score values are of type xs:double in the range
[0, 1].</p></item> 
<item><p>For score values greater than 0, a higher score must imply a
higher degree of relevance.</p></item>
</olist>

<p>Similarly to their use in a <code>for</code> clause, score variables
may be specified in a <code>let</code> clause. A score variable in a
<code>let</code> clause is also bound to the score of the expression
evaluation, but in the <code>let</code> clause one score is determined
for the complete result. The <code>let</code> variable may be dropped
from the <code>let</code> clause, if the
<code>score</code> variable is present.</p>

<p>The production for the extended <code>let</code> clause follows.


<scrap headstyle="show">
<head/>
<prod num="38" id="doc-xquery-LetClause"><lhs>LetClause</lhs><rhs>(("let"  "$"  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-VarName" xlink:type="simple">VarName</nt>  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-TypeDeclaration" xlink:type="simple">TypeDeclaration</nt>?)  |  ("let"  "score"  "$"  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-VarName" xlink:type="simple">VarName</nt>))  ":="  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-ExprSingle" xlink:type="simple">ExprSingle</nt>  (","  (("$"  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-VarName" xlink:type="simple">VarName</nt>  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-TypeDeclaration" xlink:type="simple">TypeDeclaration</nt>?)  |  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTScoreVar" xlink:type="simple">FTScoreVar</nt>)  ":="  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-ExprSingle" xlink:type="simple">ExprSingle</nt>)*</rhs></prod>
</scrap> 
</p>
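<p>As a sketch of how this production can be read (the query below is
illustrative and reuses the <code>/books/book</code> structure assumed in the
other examples), an ordinary variable binding and a <code>score</code> binding
may appear in the same <code>let</code> clause, separated by a comma:</p>
<eg role="xquery" xml:space="preserve">
for $b in /books/book
let $t := $b/title,
    score $s := $b/content ftcontains "usability"
order by $s descending
return &lt;result score="{$s}"&gt;{$t}&lt;/result&gt;
</eg>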

<p>When the score option is used in a <code>for</code> clause, the
expression following the <code>in</code> keyword serves the dual purpose
of filtering, i.e., driving the iteration, and determining the scores.
It is possible, however, to specify separate expressions for filtering and
scoring by combining a simple <code>for</code> clause with a
<code>let</code> clause that uses scoring. The following is 
an example of this.

		<eg role="xquery" xml:space="preserve">
for $b in /books/book[.//chapter/title ftcontains "testing"]
let score $s := $b/content ftcontains "web site" ftand "usability" 
order by $s descending
return &lt;result score="{$s}"&gt;{$b}&lt;/result&gt;
</eg>
This example returns <code>book</code> elements with chapter titles that contain "testing". Along with the <code>book</code> elements scores are returned. These scores, however, reflect whether the book content contains "web site" and "usability".</p>

<p>Note that the score of an 
FTContainsExpr is not required to be 0 if the expression evaluates to false, nor to
be non-zero if the expression evaluates to true.
Hence, in the example above it is not possible to infer the Boolean
value of the FTContainsExpr in the <code>let</code> clause from the
calculated score of a returned <code>result</code> element. For instance, an
implementation may want to assign a non-zero score to a book that
contains only "web site", but not "usability", as this may be
considered more relevant than a book that contains neither.
</p>


<p>
The expression ExprSingle assigned to the score variable is passed to
the scoring algorithm and is not evaluated directly. The scoring
algorithm calculates the score value based on the passed expression
(not on the value returned by evaluating the expression). The set of
supported expressions is <termref def="dt-implementation-defined">implementation-defined</termref>. 
</p>

<p>The use of <code>score</code> variables introduces a second-order
aspect to the evaluation of expressions which cannot be emulated by
(first-order) XQuery functions. Consider the following replacement of
the clause <code>let score $s := FTContainsExpr</code></p>

		<eg xml:space="preserve">
let $s := score(FTContainsExpr)
</eg>

<p>where a function <code>score</code> is applied to some
FTContainsExpr. If the function <code>score</code> were first-order, it
would only be applied to the result of the evaluation of 
its argument, which is one of the Boolean constants <code>true</code>
or <code>false</code>. Hence, there would be at most two possible
values such a <code>score</code> function would be able to return and
no further differentiation would be possible. </p>
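<p>To make this limitation concrete, the following sketch declares a
hypothetical first-order function <code>local:score</code> (it is not part of
this specification and is introduced here only for illustration). Because the
function receives only the Boolean result of its argument, it can return at
most two distinct values, regardless of how well each book matches:</p>
<eg role="xquery" xml:space="preserve">
declare function local:score($match as xs:boolean) as xs:double
{
  (: only the Boolean outcome is visible here, not the full-text selection :)
  if ($match) then 1.0e0 else 0.0e0
};

for $b in /books/book
let $s := local:score($b/content ftcontains "web site" ftand "usability")
return &lt;result score="{$s}"&gt;{$b/title}&lt;/result&gt;
</eg>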


   <div3 id="section-using-weights">
      <head>Using Weights Within a Scored FTContainsExpr</head>

<p><termdef id="WeightDeclarationsDef" term="WeightDeclarations">Scoring may be influenced by adding <term>weight declarations</term> to search tokens, phrases, and expressions.</termdef>
Syntactically weight declarations are introduced in the FTSelection
production, described in Section <specref ref="ftselections"/>.
</p>

<p>The effect of weights on the result score is
<termref def="dt-implementation-dependent">implementation-dependent</termref>. However, weight declarations must follow
these rules:</p> 
<olist>
<item><p>Weights in an FTContainsExpr are significant only in relation
to each other.</p></item>
<item><p>When no explicit weight is specified, the default weight is
1.0.</p></item>
<item><p>The weight must be between 0.0 and 1000.0 inclusive.</p></item>
</olist>

<p>
Weight declarations in an FTContainsExpr for which no scores are
evaluated are ignored. 
</p>
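<p>For illustration, in the following sketch no score variable is bound, so no
scores are evaluated. Under the rule above the weight declarations are
ignored, and the two path expressions select the same <code>book</code>
elements:</p>
<eg role="xquery" xml:space="preserve">
/books/book[content ftcontains ("web site" weight 0.5)
                         ftand ("usability" weight 2)]

/books/book[content ftcontains "web site" ftand "usability"]
</eg>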

<p>The following example illustrates how different weights can be used
for different search terms.
		<eg role="xquery" xml:space="preserve">
for $b in /books/book
let score $s := $b/content ftcontains ("web site" weight 0.5)
                                ftand ("usability" weight 2)
return &lt;result score="{$s}"&gt;{$b}&lt;/result&gt;
</eg>
</p>

   </div3>
	</div2>


   <div2 id="section-extensions-static-context">
      <head>Extensions to the Static Context</head>
<p>
The XQuery Static Context is extended by a component for each of the
full-text match options. Thus, the default of a match option in a
query may be changed by providing a setting in the static context using the
following declaration syntax.
<scrap headstyle="show"><head/>
	<prod num="6" id="doc-xquery-Prolog"><lhs>Prolog</lhs><rhs>((<nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-DefaultNamespaceDecl" xlink:type="simple">DefaultNamespaceDecl</nt>  |  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-Setter" xlink:type="simple">Setter</nt>  |  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-NamespaceDecl" xlink:type="simple">NamespaceDecl</nt>  |  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-Import" xlink:type="simple">Import</nt>)  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-Separator" xlink:type="simple">Separator</nt>)*  ((<nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-VarDecl" xlink:type="simple">VarDecl</nt>  |  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-FunctionDecl" xlink:type="simple">FunctionDecl</nt>  |  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-OptionDecl" xlink:type="simple">OptionDecl</nt>  |  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTOptionDecl" xlink:type="simple">FTOptionDecl</nt>)  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-Separator" xlink:type="simple">Separator</nt>)*</rhs></prod>
	<prod num="14" id="doc-xquery-FTOptionDecl"><lhs>FTOptionDecl</lhs><rhs>"declare"  "ft-option"  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTMatchOptions" xlink:type="simple">FTMatchOptions</nt></rhs></prod>
</scrap>
Match options modify the match semantics of full-text
expressions. They are described in detail in  
Section <specref ref="ftmatchoptions"/>. When a match
option is specified explicitly in a query, that
setting overrides the setting of the respective match option in the
static context.
</p>
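<p>A minimal sketch of such a declaration follows. It assumes the stemming
match option keywords <code>with stemming</code> and
<code>without stemming</code> described in Section <specref ref="ftmatchoptions"/>;
the element names are illustrative. The prolog declaration makes stemming the
default for the query, so "dog" is matched with stemming, while the explicit
option on "cat" overrides the static context setting for that selection
only:</p>
<eg role="xquery" xml:space="preserve">
declare ft-option with stemming;

/books/book[title ftcontains "dog" ftand ("cat" without stemming)]/author
</eg>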



   </div2>
</div1>


<div1 id="ftselections">
	<head>Full-Text Selections</head>

<p>This section describes the
full-text selections which contain the full-text
operators in a <termref def="dt-ftcontains">full-text contains
expression</termref>  
(<nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTContainsExpr" xlink:type="simple">FTContainsExpr</nt>), as 
well as the match options which modify the matching semantics of the 
full-text selections. In the following the syntax for each type of
full-text selection is given together with an informal statement of
its meaning.</p>

<p><termdef id="ftselection" term="full-text selection">A 
<term>full-text selection</term> specifies the possible
full-text search conditions.
</termdef></p>

<scrap headstyle="show">
<head/>
<prod num="144" id="doc-xquery-FTSelection"><lhs>FTSelection</lhs><rhs><nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTOr" xlink:type="simple">FTOr</nt>  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTPosFilter" xlink:type="simple">FTPosFilter</nt>*  ("weight"  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-RangeExpr" xlink:type="simple">RangeExpr</nt>)?</rhs></prod>
</scrap>

<p>As shown in the grammar, a full-text selection consists of search 
conditions possibly involving logical operators (<nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTOr" xlink:type="simple">FTOr</nt>) followed by an 
arbitrary number of positional filters (<nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTPosFilter" xlink:type="simple">FTPosFilter</nt>)
optionally followed by a "weight" value which is specified using a 
 range expression.
The RangeExpr is evaluated, as if it were an argument to a function 
with an expected type "xs:double"; it must
be between 0.0 and 1000.0 inclusive.</p>

<p>The syntax and semantics of the individual full-text selection
operators follow.</p>


<p>This XML document fragment
is the source document for examples in this section. </p>

 <eg xml:space="preserve">&lt;book number="1"&gt;
  &lt;title shortTitle="Improving Web Site Usability"&gt;Improving  
      the Usability of a Web Site Through Expert Reviews and
      Usability Testing&lt;/title&gt;
   &lt;author&gt;Millicent Marigold&lt;/author&gt;
   &lt;author&gt;Montana Marigold&lt;/author&gt;
   &lt;editor&gt;Véra Tudor-Medina&lt;/editor&gt;
   &lt;content&gt;
     &lt;p&gt;The usability of a Web site is how well the  
         site supports the users in achieving specified  
         goals. A Web site should facilitate learning,  
         and enable efficient and effective task  
         completion, while propagating few errors.
     &lt;/p&gt;
     &lt;note&gt;This book has been approved by the Web Site  
         Users Association.
     &lt;/note&gt;
   &lt;/content&gt;
 &lt;/book&gt;</eg>

<p>Tokenization is <termref def="dt-implementation-defined">implementation-defined</termref>. A sample tokenization is
used for the examples in this section. 
This sample tokenization uses white space, punctuation and XML tags as word-breakers and 
<code>&lt;p&gt;</code> for paragraph boundaries. The results may be different
for other tokenizations.</p>  
 <p>The first five tokens in this example using the sample tokenization would be "Improving", "the", "usability", "of", and "a".</p>

<p>Unless stated otherwise, the results
assume a case-insensitive match.</p>

<div2 id="ftprimary">
	<head>Primary Full-Text Selections</head>

<scrap headstyle="show">
<head/>
<prod num="150" id="doc-xquery-FTPrimary"><lhs>FTPrimary</lhs><rhs>(<nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTWords" xlink:type="simple">FTWords</nt>  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTTimes" xlink:type="simple">FTTimes</nt>?)  |  ("("  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTSelection" xlink:type="simple">FTSelection</nt>  ")")  |  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTExtensionSelection" xlink:type="simple">FTExtensionSelection</nt></rhs></prod>
</scrap>

<p><termdef id="dt-ftprimary" term="primary full-text selection">A 
<term>primary full-text selection</term> is the basic form of a 
full-text selection. It specifies words and phrases as search 
conditions (<nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTWords" xlink:type="simple">FTWords</nt>), optionally followed by a cardinality constraint 
(<nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTTimes" xlink:type="simple">FTTimes</nt>). An <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTSelection" xlink:type="simple">FTSelection</nt> 
in parentheses is also a primary full-text selection.</termdef>
</p>

</div2>


<div2 id="ftwords">
	<head>Search Tokens and Phrases</head>

<scrap headstyle="show">
<head/>
<prod num="151" id="doc-xquery-FTWords"><lhs>FTWords</lhs><rhs><nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTWordsValue" xlink:type="simple">FTWordsValue</nt>  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTAnyallOption" xlink:type="simple">FTAnyallOption</nt>?</rhs></prod>
<prod num="152" id="doc-xquery-FTWordsValue"><lhs>FTWordsValue</lhs><rhs><nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-Literal" xlink:type="simple">Literal</nt>  |  ("{"  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-Expr" xlink:type="simple">Expr</nt>  "}")</rhs></prod> 
<prod num="154" id="doc-xquery-FTAnyallOption"><lhs>FTAnyallOption</lhs><rhs>("any"  "word"?)  |  ("all"  "words"?)  |  "phrase"</rhs></prod>
</scrap>

<p><nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTWords" xlink:type="simple">FTWords</nt> finds matches that contain the specified 
tokens and phrases.</p>

<p>FTWords consists of two parts: a mandatory <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTWordsValue" xlink:type="simple">
FTWordsValue</nt> part and an optional <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTAnyallOption" xlink:type="simple">
FTAnyallOption</nt> part. <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTWordsValue" xlink:type="simple">FTWordsValue</nt> specifies the tokens and phrases
that must be contained in the matches. <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTAnyallOption" xlink:type="simple">FTAnyallOption</nt> specifies how 
containment is checked. </p>

<p>The <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTWordsValue" xlink:type="simple">FTWordsValue</nt> is converted as though it were an argument to a
function with the expected type of "xs:string*".
</p>

<p>In general, the tokens and phrases in <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTWordsValue" xlink:type="simple">
FTWordsValue</nt> are specified using a nested XQuery expression. 
To simplify notation, the enclosing braces may be omitted if <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTWordsValue" xlink:type="simple">FTWordsValue</nt> consists of a single literal.
</p>

<p>The following rules specify how the containment of the strings from the
<nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTWordsValue" xlink:type="simple">FTWordsValue</nt> sequence is checked. First,
every string is tokenized into a sequence of tokens as
described in <loc xmlns:xlink="http://www.w3.org/1999/xlink" href="#TokenizationSec" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest">Section 4.1 Tokenization</loc>.
Then, <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTAnyallOption" xlink:type="simple">FTAnyallOption</nt> is checked.</p>

<p>If <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTAnyallOption" xlink:type="simple">FTAnyallOption</nt> is "any", the sequence of tokens for every string is
considered as a phrase, i.e. the tokens must occur consecutively in the
text in the specified order. If the sequence contains more than one string, 
the different strings are considered to be alternatives, i.e.  the resulting 
matches must contain at least one of the generated phrases.</p>

<p>If <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTAnyallOption" xlink:type="simple">FTAnyallOption</nt> is "all", the sequence of tokens for every string is
considered as a phrase. The resulting matches must contain all of the 
generated phrases.</p>

<p>If <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTAnyallOption" xlink:type="simple">FTAnyallOption</nt> is "phrase", the tokens from all the strings are
concatenated in a single sequence, which is considered as a phrase. The
resulting matches must contain the generated phrase.</p>

<p>If <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTAnyallOption" xlink:type="simple">FTAnyallOption</nt> is "any word", the tokens from all the strings are
combined into a single set. The resulting matches must contain at least
one of the tokens in the set.</p>

<p>If <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTAnyallOption" xlink:type="simple">FTAnyallOption</nt> is "all words", the tokens from all the strings are
combined into a single set. The resulting matches must contain all
of the tokens in the set.</p>

<p>If the <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTWordsValue" xlink:type="simple">FTWordsValue</nt> evaluates to
a single string, the use of "any", "all", and "phrase" in
<nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTAnyallOption" xlink:type="simple">FTAnyallOption</nt> produces the same
results.</p>

<p>If <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTAnyallOption" xlink:type="simple">FTAnyallOption</nt> is omitted, "any" is 
the default.</p>

<p>The following expression returns the <code>book</code> element whose
<code>number</code> is 
1, because its <code>title</code> element contains the token "Expert":</p>
<eg role="xpath" xml:space="preserve">/book[@number="1" and ./title ftcontains "Expert"]</eg>

<p>The following expression returns the <code>book</code> element whose
<code>number</code> is 1, because its <code>title</code> element contains the
phrase "Expert Reviews":</p>
<eg role="xpath" xml:space="preserve">/book[@number="1" and ./title ftcontains "Expert Reviews"]</eg>

<p>The following expression returns the <code>book</code> element whose
<code>number</code> is 1, because its <code>title</code> element contains both of the
tokens "Expert" and "Reviews":</p>
<eg role="xpath" xml:space="preserve">/book[@number="1" and ./title ftcontains {"Expert",
"Reviews"} all]</eg>

<p>The following expression returns false, because the <code>p</code> element doesn't
contain the phrase "Web Site Usability" although it contains all of the tokens
in the phrase:</p>
<eg role="xpath" xml:space="preserve">/book[@number="1"]//p ftcontains "Web Site Usability"</eg> 

<p>The following expression returns book numbers of <code>book</code> elements by
"Marigold" with a title about "Web Site Usability", sorting them in descending
score order: </p> 
<eg role="xquery" xml:space="preserve">for $book in /book[.//author ftcontains "Marigold"] 
let score $score := $book/title ftcontains "Web Site Usability" 
where $score &gt; 0.8 
order by $score descending
return $book/@number</eg> 
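
<p>The remaining FTAnyallOption alternatives can be sketched against the same
source document. The results stated below assume the sample tokenization and a
case-insensitive match, and may differ for other tokenizations.</p>

<p>The following expression is expected to return true, because the tokens of the
two strings are concatenated into the single phrase "Expert Reviews", which
occurs in the <code>title</code> element:</p>
<eg role="xpath" xml:space="preserve">/book[@number="1"]/title ftcontains {"Expert", "Reviews"} phrase</eg>

<p>The following expression is expected to return true, because the
<code>p</code> element contains at least one of the tokens "Web", "Site", and
"Usability":</p>
<eg role="xpath" xml:space="preserve">/book[@number="1"]//p ftcontains "Web Site Usability" any word</eg>

<p>The following expression is also expected to return true, because the
<code>p</code> element contains all three tokens, although not consecutively
as a phrase:</p>
<eg role="xpath" xml:space="preserve">/book[@number="1"]//p ftcontains "Web Site Usability" all words</eg>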

</div2>


<div2 id="ftmatchoptions">
	<head>Match Options</head>


<p>Full-text match options modify the matching behavior of 
the <termref def="dt-ftprimary">primary full-text selection</termref> to which 
they are applied. </p> 

<scrap headstyle="show"><head/>
	<prod num="149" id="doc-xquery-FTPrimaryWithOptions"><lhs>FTPrimaryWithOptions</lhs><rhs><nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTPrimary" xlink:type="simple">FTPrimary</nt>  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTMatchOptions" xlink:type="simple">FTMatchOptions</nt>?</rhs></prod>
	<prod num="165" id="doc-xquery-FTMatchOptions"><lhs>FTMatchOptions</lhs><rhs><nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTMatchOption" xlink:type="simple">FTMatchOption</nt>+</rhs></prod>
	<prod num="166" id="doc-xquery-FTMatchOption"><lhs>FTMatchOption</lhs><rhs><nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTLanguageOption" xlink:type="simple">FTLanguageOption</nt><br/>|  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTWildCardOption" xlink:type="simple">FTWildCardOption</nt><br/>|  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTThesaurusOption" xlink:type="simple">FTThesaurusOption</nt><br/>|  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTStemOption" xlink:type="simple">FTStemOption</nt><br/>|  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTCaseOption" xlink:type="simple">FTCaseOption</nt><br/>|  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTDiacriticsOption" xlink:type="simple">FTDiacriticsOption</nt><br/>|  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTStopwordOption" xlink:type="simple">FTStopwordOption</nt><br/>|  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTExtensionOption" xlink:type="simple">FTExtensionOption</nt></rhs></prod>
</scrap>

<p><termdef id="dt-match-options" term="match option"><term>Match options</term>  modify the set of tokens
      in the query, or how they are matched against tokens in the
      text.</termdef> 
</p>
<p><termdef id="dt-match-option-group" term="match option group">
Each of the alternatives of production 
<nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTMatchOption" xlink:type="simple">FTMatchOption</nt>
other than <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTExtensionOption" xlink:type="simple">FTExtensionOption</nt>
corresponds to one <term>match option group</term>, giving seven groups in all. </termdef>
The match options from any given group are mutually exclusive, i.e., 
only one of these settings can be in effect, whereas match options of
different groups can be combined freely.</p>

<p>
Note that, in addition to the syntax rules above, the extra-grammatical 
constraint
<loc xmlns:xlink="http://www.w3.org/1999/xlink" href="#parse-note-multiple-match-options" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest">multiple-match-options</loc>
must be considered when multiple match options are specified.
It states that within a single <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTMatchOptions" xlink:type="simple">FTMatchOptions</nt>
at most one match option of any given 
<termref def="dt-match-option-group">match option group</termref> may
be specified. 
For example, if the <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTCaseOption" xlink:type="simple">FTCaseOption</nt> "lowercase" 
is specified, then "uppercase" cannot also be specified as part of the same 
<nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTMatchOptions" xlink:type="simple">FTMatchOptions</nt>.
</p>
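
<p>As an illustration (a sketch, not a normative example), the following
expression combines one option from the case group with one option from the
stemming group, which the constraint permits:</p>
<eg role="xpath" xml:space="preserve">/book/title ftcontains "usability" lowercase with stemming</eg>

<p>By contrast, the following expression violates the constraint, because
"lowercase" and "uppercase" both belong to the case group:</p>
<eg role="xpath" xml:space="preserve">/book/title ftcontains "usability" lowercase uppercase</eg>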

<p>Although match options only take effect in the application of 
<nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTWords" xlink:type="simple">FTWords</nt>, the syntax also allows match options 
to be specified on the non-primitive full-text selection 
<code>"(" FTSelection ")"</code>. Such a higher-level match option
provides a default for the respective match option group for any
embedded <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTPrimary" xlink:type="simple">FTPrimary</nt>, just as the static
context components corresponding to the
<termref def="dt-match-option-group">match option groups</termref>
provide default match options for the whole query. Details about
these context components, including their default values, are given in
Appendix <specref ref="id-xqft-static-context-components"/>.</p>

<p>In other words, there is a tuple of seven effective match options,
one from each group, which are propagated from top to bottom in the
query syntax tree. For the top-level query the seven values are given
by the static context, and at each <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTPrimary" xlink:type="simple">FTPrimary</nt>
the match options specified locally (in the manner of postfix operators) may
override these propagated values.
Thus, any occurrence of an <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTWords" xlink:type="simple">FTWords</nt> in a
query is associated with seven effective match options, one from each
group, that influence its matching.</p> 
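
<p>A sketch of this propagation, using a hypothetical query: the outer
"case sensitive" applies to the parenthesized selection and is inherited by the
embedded FTWords, which adds "with stemming" locally. The remaining five groups
take their values from the static context:</p>
<eg role="xpath" xml:space="preserve">/book/title ftcontains ("usability" with stemming) case sensitive</eg>

<p>Under the rules above, the effective match options of the FTWords in this
sketch are the same as in the following expression:</p>
<eg role="xpath" xml:space="preserve">/book/title ftcontains "usability" case sensitive with stemming</eg>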


<p>The order in which effective match options for an <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTWords" xlink:type="simple">FTWords</nt> are applied is subject to some constraints:
<olist>
<item><p>The Language Option must be applied first</p></item>
<item><p>The Stemming Option must be applied before the Case Option and the
Diacritics Option</p></item>
</olist>
Aside from these constraints, the full order of the application of match
options is <termref def="dt-implementation-defined">implementation-defined</termref>.
<termdef id="dt-match-option-order" term="match option application order">
This order is called the <term>match option application order</term>.</termdef>
</p>

<p>
 More information on
the semantics of match options is given in <specref ref="FTMatchOptionsSec"/>.</p>

<p>If no match options declarations are present in the prolog and the
implementation does not define any overwriting of the static context
components for the match options, the query:</p> 

<eg role="xpath" xml:space="preserve">/book/title ftcontains "usability" </eg>

<p>is, assuming "de" is the <termref def="dt-implementation-defined">implementation-defined</termref> default language,
equivalent to the query:</p>

<eg role="xpath" xml:space="preserve">/book/title ftcontains "usability" case insensitive 
    diacritics insensitive 
    without stemming without thesaurus  
    without stop words language "de" without wildcards</eg>


<p> We describe each match option group in more detail in the following
sections.</p>



<div3 id="ftcaseoption">
	<head>Case Option</head>

<scrap headstyle="show"><head/>
			<prod num="167" id="doc-xquery-FTCaseOption"><lhs>FTCaseOption</lhs><rhs>("case"  "insensitive")<br/>|  ("case"  "sensitive")<br/>|  "lowercase"<br/>|  "uppercase"</rhs></prod>
</scrap>

<p><termdef id="dt-ftcaseoption" term="case option">A <term>case option</term>
modifies the matching of tokens and phrases by specifying how uppercase and 
lowercase characters are considered.</termdef>
</p>


<p>There are four possible character case options:</p>

<olist>
<item><p>Using the option "case insensitive", tokens and phrases are matched
regardless of the case of the characters in the query tokens and phrases.</p></item>

<item><p>Using the option "case sensitive", tokens and phrases are matched
if and only if the case of their characters is the same as written in the
query.</p></item>

<item><p>Using the option "lowercase", tokens and phrases are matched if
and only if they match the query without regard to character case, but contain 
only lowercase characters.</p></item>

<item><p>Using the option "uppercase", tokens and phrases are matched if
and only if they match the query without regard to character case, but contain 
only uppercase characters.</p></item>

</olist>

<p>The default is "case insensitive". </p>

<p>The following table summarizes the interactions between 
the case match options and the use of the default collation.</p>

<p>
 <table border="1">
      <caption>Case Matrix</caption>
      <thead>
       <tr>
        <td rowspan="1" colspan="1">Default collation options/Case options</td>
        <td rowspan="1" colspan="1">UCC (Unicode Codepoint Collation)</td>
        <td rowspan="1" colspan="1">CCS (some generic case-sensitive collation)</td>
        <td rowspan="1" colspan="1">CCI (some generic case-insensitive collation) </td>
       </tr>
      </thead>
      <tbody>
       <tr>
        <td rowspan="1" colspan="1">insensitive</td>
        <td rowspan="1" colspan="1">compare as if both lower</td>
        <td rowspan="1" colspan="1">case-insensitive variant of CCS if it exists, else error</td>
        <td rowspan="1" colspan="1">CCI</td>
       </tr>
       <tr>
        <td rowspan="1" colspan="1">sensitive</td>
        <td rowspan="1" colspan="1">UCC</td>
        <td rowspan="1" colspan="1">CCS</td>
        <td rowspan="1" colspan="1">case-sensitive variant of CCI if it exists, else error</td>
       </tr>
       <tr>
        <td rowspan="1" colspan="1">lowercase</td>
        <td rowspan="1" colspan="1">lowercase(Expr) + UCC</td>
        <td rowspan="1" colspan="1">lowercase(Expr) + CCS</td>
        <td rowspan="1" colspan="1">CCI</td>
       </tr>
       <tr>
        <td rowspan="1" colspan="1">uppercase</td>
        <td rowspan="1" colspan="1">uppercase(Expr) + UCC</td>
        <td rowspan="1" colspan="1">uppercase(Expr) + CCS</td>
        <td rowspan="1" colspan="1">CCI</td>
       </tr>
      </tbody>
     </table>
</p>

<note><p>In this table, "else error" means "Otherwise, an error
is raised: <xerrorref spec="FO" class="CH" code="0002" type="dynamic"/>". 
The phrase "if it exists" is used, because
the case-sensitive collation CCS does not always have a
case-insensitive variant (and, even if one exists, it may not be
possible to determine it algorithmically), and because the
case-insensitive collation CCI does not always have a case-sensitive
variant (and, even if one exists, it may not be possible to determine
it algorithmically).</p></note>

<note><p>Using the "lowercase" (respectively "uppercase") option is equivalent
to using the option "case sensitive", while converting the query strings to 
their lowercase (respectively uppercase) form before matching.
</p></note>

<p>The following expression returns false, because the <code>title</code> element
doesn't contain "usability" in lower-case characters:</p>

<eg role="xpath" xml:space="preserve">/book[@number="1"]/title ftcontains "Usability" lowercase </eg>

<p>The following expression returns true, because the character case is not
considered:</p>

<eg role="xpath" xml:space="preserve">/book[@number="1"]/title ftcontains "usability" 
case insensitive 
</eg>
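
<p>For comparison, a sketch using "case sensitive": if the <code>title</code>
element spells the token as "Usability" (an assumption that is consistent with
the two examples above), the following expression is expected to return true,
whereas the same query written with "usability" would return false:</p>

<eg role="xpath" xml:space="preserve">/book[@number="1"]/title ftcontains "Usability" case sensitive</eg>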


</div3>

<div3 id="ftdiacriticsoption">
	<head>Diacritics Option</head>

<scrap headstyle="show"><head/>
	<prod num="168" id="doc-xquery-FTDiacriticsOption"><lhs>FTDiacriticsOption</lhs><rhs>("diacritics"  "insensitive")<br/>|  ("diacritics"  "sensitive")</rhs></prod>
</scrap>

<p><termdef id="dt-ftdiacriticsoption" term="diacritics option">A 
<term>diacritics option</term>
modifies token and phrase matching by specifying how diacritics are considered.
</termdef></p>

<p>There are two possible diacritics options:</p>

<olist>
<item><p>The option "diacritics insensitive" matches tokens and
phrases with and without diacritics. Whether or not diacritics are written in
the query is not considered.</p></item>

<item><p>The option "diacritics sensitive" matches tokens and phrases only
if they contain the diacritics as they are written in the query.</p></item>

</olist>

<p>The default is "diacritics insensitive". </p>

<p>The following table summarizes the interactions between the
diacritics match options and the use of the default collation.</p>

<p>
    <table border="1">
      <caption>Diacritics Matrix</caption>
      <thead>
       <tr>
        <td rowspan="1" colspan="1">Default collation options/Diacritics options</td>
        <td rowspan="1" colspan="1">UCC (Unicode Codepoint Collation)</td>
        <td rowspan="1" colspan="1">CDS (some generic diacritics-sensitive collation)</td>
        <td rowspan="1" colspan="1">CDI (some generic diacritics-insensitive collation) </td>
       </tr>
      </thead>
      <tbody>
       <tr>
        <td rowspan="1" colspan="1">insensitive</td>
        <td rowspan="1" colspan="1">UCC comparison, but without considering diacritics</td>
        <td rowspan="1" colspan="1">diacritics-insensitive variant of CDS
                                  if it exists, else error</td>
        <td rowspan="1" colspan="1">CDI</td>
       </tr>
       <tr>
        <td rowspan="1" colspan="1">sensitive</td>
        <td rowspan="1" colspan="1">UCC</td>
        <td rowspan="1" colspan="1">CDS</td>
        <td rowspan="1" colspan="1">diacritics-sensitive variant of CDI if it exists, else error</td>
       </tr>
      </tbody>
     </table>
</p>

<note><p>In this table, "else error" means "Otherwise, an error
is raised: <xerrorref spec="FO" class="CH" code="0002" type="dynamic"/>". 
The phrase "if it exists" is used, because
the diacritics-sensitive collation CDS does not always have a
diacritics-insensitive variant (and, even if one exists, it may not be
possible to determine it algorithmically), and because the
diacritics-insensitive collation CDI does not always have a
diacritics-sensitive variant (and, even if one exists, it may not be
possible to determine it algorithmically).</p></note>

<p>The following expression returns true, because the token "Véra" in the
<code>editor</code> element is matched, as the acute accent is not 
considered in the comparison:</p>

<eg role="xpath" xml:space="preserve">/book[@number="1"]//editor ftcontains "Vera" diacritics insensitive</eg>

<p>This returns false, because the <code>editor</code> element does not
contain the token "Vera" in this exact form, i.e., without any diacritics:</p>

<eg role="xpath" xml:space="preserve">/book[@number="1"]//editor ftcontains "Vera" diacritics sensitive</eg>
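
<p>Conversely, as a sketch that follows from the description above, writing the
acute accent in the query lets the "diacritics sensitive" comparison succeed.
Given that the <code>editor</code> element contains the token "Véra", the
following expression is expected to return true:</p>

<eg role="xpath" xml:space="preserve">/book[@number="1"]//editor ftcontains "Véra" diacritics sensitive</eg>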


</div3>
<!--<div3 id="ftspecialcharoption">
	<head>FTSpecialCharOption</head>

<scrap><head></head>
			<prodrecap ref="FTSpecialcharOption"/>
</scrap>

<p><nt def="FTSpecialCharOption">FTSpecialCharOption</nt>
specifies whether special characters such as punctuation should or
should not be ignored. </p>

<p>Influences the way <nt def="FTWords">FTWords</nt> is
applied. </p>

<p>The option "with special characters" specifies that special
characters such as punctuation must also be matched. The option
"without special characters" specifies that special characters such as
punctuation need not be matched.
</p>

<p>The default is "without special characters". </p>


<eg role="xpath">/book[@number="1"]//editor ftcontains "Tudor Medina" with 
special characters </eg> 

<p>returns true.</p>

<eg role="xpath">/book[@number="1"]/editors ftcontains "Tudor-Medina" without
special characters </eg> 

<p>returns false.</p>


</div3>
-->

<div3 id="ftstemoption">
	<head>Stemming Option</head>

<scrap headstyle="show"><head/>
			<prod num="169" id="doc-xquery-FTStemOption"><lhs>FTStemOption</lhs><rhs>("with"  "stemming")  |  ("without"  "stemming")</rhs></prod>
</scrap>

<p><termdef id="dt-ftstemoption" term="stemming option">A <term>stemming option</term>
modifies token and
phrase matching by specifying whether stemming is applied or not.
</termdef></p>

<p>The "with stemming" option specifies that matches may contain tokens
that have the same stem as the tokens and phrases written in the
query. It is <termref def="dt-implementation-defined">implementation-defined</termref> what a stem of a token is. </p>

<p>The "without stemming" option specifies that the tokens and
phrases are not stemmed. </p>

<p>It is <termref def="dt-implementation-defined">implementation-defined</termref> whether the stemming is based on an
algorithm, dictionary, or mixed approach. </p>

<p>The default is "without stemming". </p>


<p>The following expression returns true, because the <code>title</code> of the specified
<code>book</code> contains "improving" which has the same stem as
"improve":</p>

<eg role="xpath" xml:space="preserve">/book[@number="1"]/title ftcontains "improve" with stemming </eg>
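
<p>For contrast, a sketch of the same query with stemming switched off: assuming
that the <code>title</code> contains "improving" but no token "improve" itself,
the following expression would return false:</p>

<eg role="xpath" xml:space="preserve">/book[@number="1"]/title ftcontains "improve" without stemming</eg>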


</div3>
<div3 id="ftthesaurusoption">
	<head>Thesaurus Option</head>

<scrap headstyle="show"><head/>
	<prod num="170" id="doc-xquery-FTThesaurusOption"><lhs>FTThesaurusOption</lhs><rhs>("with"  "thesaurus"  (<nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTThesaurusID" xlink:type="simple">FTThesaurusID</nt>  |  "default"))<br/>|  ("with"  "thesaurus"  "("  (<nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTThesaurusID" xlink:type="simple">FTThesaurusID</nt>  |  "default")  (","  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTThesaurusID" xlink:type="simple">FTThesaurusID</nt>)*  ")")<br/>|  ("without"  "thesaurus")</rhs></prod>
	<prod num="171" id="doc-xquery-FTThesaurusID"><lhs>FTThesaurusID</lhs><rhs>"at"  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-URILiteral" xlink:type="simple">URILiteral</nt>  ("relationship"  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-StringLiteral" xlink:type="simple">StringLiteral</nt>)?  (<nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTRange" xlink:type="simple">FTRange</nt>  "levels")?</rhs></prod>
	<prod num="143" id="doc-xquery-URILiteral"><lhs>URILiteral</lhs><rhs><nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-StringLiteral" xlink:type="simple">StringLiteral</nt></rhs></prod>
</scrap>

<p><termdef id="dt-ftthesaurusoption" term="thesaurus option">A 
<term>thesaurus option</term>
modifies
token and phrase matching by specifying whether a thesaurus is used or
not.</termdef>
 If thesauri are used, the thesaurus option specifies how to locate 
the thesauri, either by default or through a URI
reference. It also states the relationship to be applied and how many
levels within the thesaurus are to be traversed.</p>

<p>The value of the <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTThesaurusID" xlink:type="simple">FTThesaurusID</nt>
must be a <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-URILiteral" xlink:type="simple">URILiteral</nt>.
</p>

<p>Thesauri add related tokens and phrases to the search.  Thus, the
user may narrow, broaden, or otherwise modify the search using
synonyms, hypernyms (more generic terms), etc. The search is performed
as though the user has specified all related search tokens and phrases
in a disjunction (FTOr). </p>

<note><p>A thesaurus may be standards-based or locally-defined. It may be a
traditional thesaurus, or a taxonomy, soundex, ontology, or topic
map. How the thesaurus is represented is <termref def="dt-implementation-dependent">implementation-dependent</termref>.</p>
</note> 

<p>FTThesaurusID specifies the relationship sought between the tokens and
phrases written in the query and the terms in the thesaurus. It can also
specify the number of levels to be queried in hierarchical relationships by
including an FTRange followed by "levels". If no levels are specified, the
default is to query all levels in hierarchical relationships.</p>

<p>Relationships include, but are not limited to, the relationships
and their abbreviations presented in <bibref ref="iso-2788"/> and
their equivalents in other languages. The set of relationships supported by an
implementation is <termref def="dt-implementation-defined">implementation-defined</termref>, but
implementations <termref def="should">SHOULD</termref> support the relationships
defined in <bibref ref="iso-2788"/>. The terms in the following list have the
meanings 
defined in <bibref ref="iso-2788"/>. If a query specifies thesaurus
relationships or levels not supported by the thesaurus, 
the behavior is <termref def="dt-implementation-defined">implementation-defined</termref>.
</p>
<olist>
<item><p> <emph>equivalence relationships (synonyms):</emph> PREFERRED TERM (USE), 
NONPREFERRED USED FOR TERM (UF);</p></item>
<item><p> <emph>hierarchical relationships:</emph> BROADER TERM (BT), 
NARROWER TERM (NT),  BROADER TERM GENERIC (BTG), NARROWER TERM GENERIC (NTG), 
BROADER TERM PARTITIVE (BTP), NARROWER TERM PARTITIVE (NTP), 
TOP TERM (TT); and</p></item> 
<item><p> <emph>associative relationships:</emph> RELATED TERM (RT).</p></item>
</olist>

<p>The "with thesaurus" option specifies that string matches include
tokens that can be found in one of the specified thesauri. </p>

<p>The "without thesaurus" option specifies that no thesaurus will be
used. </p>

<p>The "with thesaurus default" option specifies that a system-defined
default thesaurus with a system-defined relationship is used. The
default thesaurus may be used in combination with other explicitly
specified thesauri.</p>

<p>The default is "without thesaurus". </p>
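
<p>As a sketch of the parenthesized form of the production, the following
hypothetical expression consults the default thesaurus together with an
explicitly identified one, following "NT" relationships in the latter for at
most two levels. The thesaurus URI is illustrative, and which terms are found
depends on the contents of both thesauri:</p>

<eg role="xpath" xml:space="preserve">/book[@number="1"]/content ftcontains "web site components" with thesaurus
(default, at "http://bstore1.example.com/UsabilityThesaurus.xml"
relationship "NT" at most 2 levels)</eg>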

<p>The following expression returns true, because it finds a <code>content</code>
element containing "tasks" which the thesaurus identified as a synonym for
"duties":</p>

<eg role="xpath" xml:space="preserve">.//book/content ftcontains "duties" with
thesaurus at "http://bstore1.example.com/UsabilityThesaurus.xml"
relationship "UF"</eg>

<p>The following expression returns <code>book</code> elements, because it finds a
<code>content</code> element containing "web site components", and
narrower terms "navigation" and "layout":</p>

<eg role="xpath" xml:space="preserve">doc("http://bstore1.example.com/full-text.xml")
/books/book[./content ftcontains "web site components" with
thesaurus at "http://bstore1.example.com/UsabilityThesaurus.xml"
relationship "NT" at most 2 levels]</eg>

<p>Assuming that there is a locally
defined thesaurus that contains soundex capabilities, the following query
returns a <code>book</code> element containing "Marigold", which
sounds like "Merrygould":</p>

<eg role="xpath" xml:space="preserve">doc("http://bstore1.example.com/full-text.xml")
/books/book[. ftcontains "Merrygould" with thesaurus at
"http://bstore1.example.com/UsabilitySoundex.xml" relationship
"sounds like"]</eg>

<!--
<p>The following expression returns the true if "Synonyms" is a thesaurus for synonyms 
in the English language:</p>
<eg role="xpath">/book[@number="1"]//p ftcontains "buttress" with
thesaurus "Synonyms"</eg>
-->


</div3>
<div3 id="ftstopwordoption">
	<head>Stop Word Option</head>

<scrap headstyle="show"><head/>
	<prod num="172" id="doc-xquery-FTStopwordOption"><lhs>FTStopwordOption</lhs><rhs>("with"  "stop"  "words"  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTRefOrList" xlink:type="simple">FTRefOrList</nt>  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTInclExclStringLiteral" xlink:type="simple">FTInclExclStringLiteral</nt>*)<br/>|  ("without"  "stop"  "words")<br/>|  ("with"  "default"  "stop"  "words"  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTInclExclStringLiteral" xlink:type="simple">FTInclExclStringLiteral</nt>*)</rhs></prod>
	<prod num="173" id="doc-xquery-FTRefOrList"><lhs>FTRefOrList</lhs><rhs>("at"  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-URILiteral" xlink:type="simple">URILiteral</nt>)<br/>|  ("("  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-StringLiteral" xlink:type="simple">StringLiteral</nt>  (","  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-StringLiteral" xlink:type="simple">StringLiteral</nt>)*  ")")</rhs></prod>
	<prod num="174" id="doc-xquery-FTInclExclStringLiteral"><lhs>FTInclExclStringLiteral</lhs><rhs>("union"  |  "except")  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTRefOrList" xlink:type="simple">FTRefOrList</nt></rhs></prod>
</scrap>

<p><termdef id="dt-ftstopwordoption" term="stop word option">A 
<term>stop word option</term>
controls word matching by specifying whether stop words are used or not. 
Stop words are tokens in the query that match any token in the text. 
</termdef>
Normally a stop word matches 
exactly one token, but there may be <termref def="dt-implementation-defined">implementation-defined</termref> conditions
under which a stop word may match a different number of tokens.</p>

<p><nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTRefOrList" xlink:type="simple">FTRefOrList</nt> specifies the list
of stop words either explicitly as a comma-separated list of string
literals, or by the keyword <code>at</code> followed by a literal URI.
If the URI specifies a list of stop words that is not found in the statically
known stop word lists, an error is raised <errorref class="ST" code="0008"/>. 
Whether the stop word
list is resolved from the statically known stop word lists or given explicitly,
no tokenization is performed on the stop words: they are used as they occur  
in the sequence.
</p>

<p>The "with stop words" option specifies that if a token is within the
specified collection of stop words, it is removed from the search and
any token may be substituted for it. Stop words retain their position
numbers and are counted in <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTDistance" xlink:type="simple">FTDistance</nt>
and <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTWindow" xlink:type="simple">FTWindow</nt> searches.</p>

<p>Multiple stop word lists may be combined using "union" or "except".
The keywords "union" and "except" are applied from left to right. If "union" is specified, every string occurring in the lists  
specified by the left-hand side or the right-hand side is a stop 
word. If "except" is specified, only strings occurring in the list  
specified by the left-hand side but not in the list specified
by the right-hand side are stop words. </p>
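
<p>As a sketch of this left-to-right evaluation, the following hypothetical
option first takes the list ("a", "the"), then adds "of", and finally removes
"the", so that the effective stop words are "a" and "of":</p>

<eg role="xpath" xml:space="preserve">/book[@number="1"]//p ftcontains "propagation of errors"
with stemming with stop words ("a", "the") union ("of") except ("the")</eg>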

<p>The "with default stop words" option specifies that an
<termref def="dt-implementation-defined">implementation-defined</termref> collection of stop words is used. </p>

<p>The "without stop words" option specifies that no stop words are
used. This is equivalent to specifying an empty list of stop
words.</p>

<p>The default is "without stop words". </p>

<note><p>Stop word lists may be applied during indexing. If they are applied during indexing, asking for stop words not to be used during a query has no effect.</p></note>

<p>The following expression returns true, because the document contains the phrase
"propagating few errors":</p>

<eg role="xpath" xml:space="preserve">/book[@number="1"]//p ftcontains "propagation of errors"
with stemming with stop words ("a", "the", "of") </eg>

<p>Note the asymmetry in the stop word semantics: the property of
being a stop word is only relevant to query terms, not to document
terms. Hence, it is irrelevant for the above-mentioned match whether
"few" is a stop word or not, and on the other hand we do not want the
query above to match "propagation" followed by 2 stop words, or even a
sequence of 3 stop words in the document.</p>

<p>The following expression returns false, because "of" is not in the <code>p</code>
element between "propagating" and "errors":</p>

<eg role="xpath" xml:space="preserve">/book[@number="1"]//p ftcontains "propagation of errors" 
with stemming without stop words</eg> 

<p>The following expression uses the stop word list specified at the
URL. Assuming that the specified stop word list contains the word "then", this query
is reduced to a query on the phrase "planning X conducting", allowing any
token as a substitute for X.  It returns a <code>book</code> element,
because its <code>content</code> element contains "planning then
conducting". It would also return the <code>book</code> if the
phrase "planning and conducting" or "planning before conducting"
had been in its <code>content</code>:</p>

<eg role="xpath" xml:space="preserve">
doc("http://bstore1.example.com/full-text.xml")
/books/book[.//content ftcontains "planning then conducting"
with stop words at "http://bstore1.example.com/StopWordList.xml"]
</eg>

<p>The following expression returns <code>book</code>s containing "planning then
conducting", but does not return <code>book</code>s containing "planning
and conducting", since it exempts "then" from being a stop word: </p>

<eg role="xpath" xml:space="preserve">
doc("http://bstore1.example.com/full-text.xml")
/books/book[.//content ftcontains "planning then conducting"
with stop words at "http://bstore1.example.com/StopWordList.xml"
except ("the", "then")]
</eg>

</div3>

<div3 id="ftlanguageoption">
	<head>Language Option</head>

<scrap headstyle="show"><head/>
	<prod num="175" id="doc-xquery-FTLanguageOption"><lhs>FTLanguageOption</lhs><rhs>"language"  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-StringLiteral" xlink:type="simple">StringLiteral</nt></rhs></prod>
</scrap>

<p><termdef id="dt-ftlanguageoption" term="language option">A 
<term>language option</term> 
modifies token matching by specifying the language of search tokens and 
phrases.</termdef></p>

<p>The StringLiteral following the keyword <code>language</code>
designates one language. It must be castable to "xs:language"; otherwise, an
error is raised: <xerrorref spec="XP" class="TY" code="0004" type="type"/>. </p> 

<p>The "language" option influences tokenization, stemming, and stop
words in an <termref def="dt-implementation-defined">implementation-defined</termref> way. The "language" option <termref def="may">MAY</termref> influence the behavior of other match options in an <termref def="dt-implementation-defined">implementation-defined</termref> way.</p>

<p>The set of standardized language identifiers is defined in <bibref ref="BCP47"/>.
The set of valid language identifiers among the standardized set is <termref def="dt-implementation-defined">implementation-defined</termref>. 
An implementation <termref def="may">MAY</termref> choose to use private extensions introduced by a
singleton 'x' for additional language identifiers, or other singletons
for registered extensions as described in sec. 2.2.6 of <bibref ref="BCP47"/>.
It is <termref def="dt-implementation-defined">implementation-defined</termref> what additional language identifiers, if any, are valid. 
If an invalid language identifier is specified, then the behavior is <termref def="dt-implementation-defined">implementation-defined</termref>. 
If the implementation chooses to raise an error in that case,
it must raise <errorref class="ST" code="0009"/>.
</p>

<p>The default language is specified in the static context. </p>

<!-- 2007-01-19 Jim: make effect of conflicting languages implementation-defined -->
<p>When an XQuery 1.0 and XPath 2.0 Full-Text processor evaluates text in a document
that is governed by an xml:lang attribute and
the portion of the full-text query doing that evaluation contains an FTLanguageOption that
specifies a different language than the language specified by the governing xml:lang attribute,
the language-related behavior of that full-text query is <termref def="dt-implementation-defined">implementation-defined</termref>. </p>

<p>This is an example where
the language option is used to select the appropriate stop word list: </p>
<eg role="xpath" xml:space="preserve">/book[@number="1"]//editor ftcontains "salon de the"
with default stop words language "fr"</eg> 
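
<p>A further sketch: because the language option influences stemming, a query
can state the language whose stemming rules it expects. The outcome remains
<termref def="dt-implementation-defined">implementation-defined</termref>, and "en" is merely assumed here to be
a language identifier that the implementation accepts:</p>
<eg role="xpath" xml:space="preserve">/book[@number="1"]/title ftcontains "improve"
with stemming language "en"</eg>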

</div3>



<div3 id="ftwildcardoption">
	<head>Wildcard Option</head>

<scrap headstyle="show"><head/>
	<prod num="176" id="doc-xquery-FTWildCardOption"><lhs>FTWildCardOption</lhs><rhs>("with"  "wildcards")  |  ("without"  "wildcards")</rhs></prod>
</scrap>

<p><termdef id="dt-ftwildcardoption" term="wildcard option">A 
<term>wildcard option</term>
modifies token and phrase matching by specifying whether wildcards are used 
or not.</termdef></p>

<p>When the "with wildcards" option is used, wildcard indicators
(represented by periods (.)) and qualifiers may be appended to or
inserted into the query tokens. 
If the period is at the beginning of a query token, the wildcard is a prefix
wildcard. If the period is at the end of a query token, it is a suffix
wildcard. If the period is inserted into a query token, it is an infix
wildcard. 
</p>
<p>
Each indicator and qualifier in a query token
will match zero or more characters within a token in the text, as described
below. 
The number of characters matched depends on the qualifier. 
Qualifiers available are none, question mark, asterisk,
plus sign, and two numbers separated by a comma,
both enclosed by curly braces. </p>
 
<olist>

<item> 
<p>If a period is present, but there are no qualifiers, one character in the
text will match.
</p>
</item> 
 
<item> 
<p>If a period is followed by a question mark (.?), zero or one
characters in the text will match. </p>
</item> 

 
<item> 
<p>If a period is followed by an asterisk (.*), zero or more
characters will match.</p>
</item> 

 
<item> 
<p>If a period is followed by a plus sign (.+), one or more characters
will match. </p>
</item> 
 
 
<item> 
<p>If a period is followed by two numbers separated by a comma, both
enclosed by curly braces (.{n,m}), a specified range of characters
(at least n characters and no more than m characters)
will match.</p>
</item> 

</olist>

<p>When "with wildcards" is present and an indicator or qualifier character 
is intended to be taken literally (as itself), that character must be
preceded by ("escaped by") a backslash (\). 
For example, a period (.) that is intended to be a sentence terminator or
a decimal point must be preceded by a backslash so that it is not
interpreted to be an indicator. 
Similarly, a question mark (?), asterisk (*), or plus sign (+) that is
intended to be interpreted as an ordinary text character must be preceded by
a backslash so that it is not interpreted as a qualifier. </p>
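
<p>As a sketch of escaping, the following hypothetical query looks for the literal
token "3.5". The search string, the path, and the assumption that the tokenizer
keeps "3.5" together as a single token are illustrative only; tokenization is
<termref def="dt-implementation-defined">implementation-defined</termref>.
Without the backslash, the period would act as a wildcard indicator, and the
query token could also match a token such as "345":</p>
<eg role="xpath" xml:space="preserve">//p ftcontains "3\.5" with wildcards</eg>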
 
<p>The "without wildcards" option finds tokens without recognizing
wildcard indicators and qualifiers. 
Periods, question marks, asterisks, plus signs, and two numbers
separated by a comma, both enclosed by curly braces,
are always recognized as ordinary text characters.</p>

<p>The default is "without wildcards".</p>

<note><p>Wildcard indicators and qualifiers may be token boundaries. How text with
wildcard indicators and qualifiers is tokenized is
<termref def="dt-implementation-defined">implementation-defined</termref>.</p></note>

<p>The following expression returns true, because the <code>title</code> element
contains "improving":</p>
<eg role="xpath" xml:space="preserve">/book[@number="1"]/title ftcontains "improv.*" with
wildcards</eg>
 
<p>The following expression returns true, because the <code>title</code> element
contains "site":</p>
<eg role="xpath" xml:space="preserve">/book[@number="1"]/title ftcontains ".?site" with
wildcards</eg>

<p>The following expression returns true, because the <code>p</code> element
contains "well":</p>
<eg role="xpath" xml:space="preserve">/book[@number="1"]/p ftcontains "w.ll" with
wildcards</eg> 

<p>The following expression returns false, because the <code>p</code> element
does not contain "w.ll":</p>
<eg role="xpath" xml:space="preserve">/book[@number="1"]/p ftcontains "w.ll" without wildcards</eg> 
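
<p>The plus sign and curly-brace qualifiers can be sketched in the same way.
The two expressions below are illustrative only. They assume, as other examples
in this document indicate, that the <code>title</code> element of the sample book
contains the tokens "testing" and "usability". Under that assumption, both would be
expected to return true: "testing" is "test" followed by three characters, and
"usability" is "us" followed by seven characters, which lies within the range 3 to 10:</p>
<eg role="xpath" xml:space="preserve">/book[@number="1"]/title ftcontains "test.+" with
wildcards</eg>
<eg role="xpath" xml:space="preserve">/book[@number="1"]/title ftcontains "us.{3,10}" with
wildcards</eg>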


</div3>

<div3 id="ftextensionoption">
<head>Extension Option</head>

<p><termdef id="dt-ftextensionoption" term="extension option">An
<term>extension option</term> is a match option that acts in an
<termref def="dt-implementation-defined">implementation-defined</termref> way.
</termdef>
</p>

<scrap headstyle="show">
<head/>
    <prod num="177" id="doc-xquery-FTExtensionOption"><lhs>FTExtensionOption</lhs><rhs>"option"  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-QName" xlink:type="simple">QName</nt>  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-StringLiteral" xlink:type="simple">StringLiteral</nt></rhs></prod>
</scrap>

<p>An extension option consists of an identifying QName and a StringLiteral.
Typically, a particular option will be recognized by some implementations and
not by others. The syntax is designed so that option declarations can be
successfully parsed by all implementations.
</p> 

<p>The QName of an option must resolve to a namespace URI and local name, using
the statically known namespaces.</p>

<note><p>There is no default namespace for options.</p></note>

<p>Each implementation recognizes an 
<termref def="dt-implementation-defined">implementation-defined</termref>
set of namespace
URIs used to denote extension options.</p>

<p>If the namespace part of the QName is not a namespace recognized by the
implementation as one used to denote extension options, then the extension option
is ignored.</p>

<p>Otherwise, the effect of the extension option, including its error behavior,
is <termref def="dt-implementation-defined">implementation-defined</termref>.
For example, if the local part of the QName is
not recognized, or if the StringLiteral does not conform to the rules
defined by the implementation for the particular extension option, the implementation may choose
whether to report an error, ignore the extension option, or take some
other action.</p>

<p>Implementations may impose rules on where particular extension options may
appear relative to other match options, and the
interpretation of an option declaration may depend on its position.</p>

<p>An extension option must not be used to change the syntax accepted by the
processor, or to suppress the detection of static errors. However, it may be
used without restriction to modify the set of tokens in the query or how they
are matched against tokens in the text. 
An extension option has the same scope as other match options.
</p>

<p>The following examples illustrate several possible uses for extension
options:</p>
<p>This extension option is set as part of the static context of all 
full-text expressions in the module and might be used to ensure that 
queries are insensitive to Arabic short-vowels.
</p>
<eg role="parse-test" xml:space="preserve">
declare namespace exq = "http://example.org/XQueryImplementation";

declare ft-option option exq:diacritics "short-vowel insensitive";
</eg>
<p>This extension option applies only to the matching in the full-text
selection in which it is found and might be used to specify how compound words
should be matched.
</p>
<eg role="parse-test" xml:space="preserve">
declare namespace exq = "http://example.org/XQueryImplementation";

//para[. ftcontains ("Kinder" ftand "Platz")
        with stemming option exq:compounds "distance=1"
        distance exactly 1 words]
</eg>

</div3>
</div2>
<div2 id="logical_ftoperators">
      <head>Logical Full-Text Operators</head>
      
<p>
Full-text selections can be combined with the logical connectives
<code>ftor</code> (full-text or), <code>ftand</code> (full-text and), <code>not in</code> (mild not),
and <code>ftnot</code> (unary full-text not).</p>

<scrap headstyle="show"><head/>
<prod num="145" id="doc-xquery-FTOr"><lhs>FTOr</lhs><rhs><nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTAnd" xlink:type="simple">FTAnd</nt> ( "ftor"  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTAnd" xlink:type="simple">FTAnd</nt> )*</rhs></prod>
<prod num="146" id="doc-xquery-FTAnd"><lhs>FTAnd</lhs><rhs><nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTMildNot" xlink:type="simple">FTMildNot</nt> ( "ftand"  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTMildNot" xlink:type="simple">FTMildNot</nt> )*</rhs></prod>
<prod num="147" id="doc-xquery-FTMildNot"><lhs>FTMildNot</lhs><rhs><nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTUnaryNot" xlink:type="simple">FTUnaryNot</nt> ( "not"  "in"  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTUnaryNot" xlink:type="simple">FTUnaryNot</nt> )*</rhs></prod>
<prod num="148" id="doc-xquery-FTUnaryNot"><lhs>FTUnaryNot</lhs><rhs>("ftnot")? <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTPrimaryWithOptions" xlink:type="simple">FTPrimaryWithOptions</nt></rhs></prod>
</scrap>
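
<p>The productions above also determine how the operators group when parentheses
are omitted: <code>ftnot</code> binds most tightly, followed by <code>not in</code>,
<code>ftand</code>, and <code>ftor</code>. As an illustrative sketch (the search
strings are chosen arbitrarily), the following two expressions are grouped
identically:</p>
<eg role="xpath" xml:space="preserve">/book/title ftcontains "usability" ftor "testing" ftand ftnot "site"</eg>
<eg role="xpath" xml:space="preserve">/book/title ftcontains "usability" ftor ("testing" ftand (ftnot "site"))</eg>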
 
<div3 id="sec-ftor">
<head>Or-Selection</head>
<p><termdef id="dt-or-selection" term="or-selection">An
<term>or-selection</term> combines two full-text selections using the 
<code>ftor</code> operator.</termdef>
</p>
       
<p>An or-selection finds all matches that satisfy at least
one of the operand full-text selections. </p>

<p>The following expression returns the <code>book</code> element written by
"Millicent":</p> 
	
<eg role="xpath" xml:space="preserve"> /book[.//author ftcontains "Millicent" ftor
"Voltaire"] </eg>
</div3>

<div3 id="sec-ftand">
<head>And-Selection</head>
<p><termdef id="dt-and-selection" term="and-selection">An
<term>and-selection</term> combines two full-text selections using the 
<code>ftand</code> operator.</termdef>
</p>
       
<p>An and-selection finds matches that satisfy all of the operand full-text 
selections simultaneously. A match of an and-selection is formed by combining
matches for each of the operand full-text selections as described in
<specref ref="tq-ft-fs-FTAnd"/>. </p>

<p>For example, <code>"usability" ftand "testing"</code> will find two 
matches
in <code>/book[@number="1"]/title</code>: each of the two matches for the
FTWords selection <code>"usability"</code> (the two occurrences of the 
token "usability" in the string value of the title element) is combined 
with the single match for the FTWords <code>"testing"</code> (only one 
occurrence of the token "testing" in the title).
Since the above and-selection has at least one match, the following
expression will return "true". </p>

<eg role="xpath" xml:space="preserve">/book[@number="1"]/title ftcontains ("usability" ftand "testing")</eg>

<p>The following expression returns false, because "Millicent" and "Montana" are not
contained by the same <code>author</code> element in any <code>book</code>
element:</p>

<eg role="xpath" xml:space="preserve">/book/author ftcontains "Millicent" ftand "Montana"</eg>

<p>No <code>author</code> element in any <code>book</code> element 
contains both "Millicent" and "Montana". Therefore, for any such 
<code>author</code> element, there is either one match for the 
FTWords <code>"Millicent"</code> and zero matches for the FTWords 
<code>"Montana"</code>, or vice versa, or no match for either
of them. In any of these cases, the and-selection will have zero 
matches.</p>
</div3>

<div3 id="sec-ftmildnot">
<head>Mild-Not Selection</head>
<p><termdef id="dt-mild-not-selection" term="mild-not selection">A
<term>mild-not selection</term> combines two full-text selections 
using the <code>not in</code> operator.</termdef>
</p>

<p>The <code>not in</code> operator is a milder form of the operator combination
<code>ftand ftnot</code>. The selection <code>A not in B</code> matches a token
sequence that matches <code>A</code>, but not when it is a part of a 
match of <code>B</code>. 
In contrast, <code>A ftand ftnot B</code> finds matches only when the token 
sequence contains <code>A</code> and does not contain <code>B</code>.</p>

<p>
As an example, consider a search for <code>"Mexico" not in "New Mexico"</code>.
This may return, among others, a document
which is all about "Mexico" but mentions at the end that "New Mexico
was named after Mexico". The occurrence of "Mexico" in "New Mexico" is not 
considered, but other occurrences of "Mexico" are matched. Note that this
document would not be matched by the full-text selection 
<code>"Mexico" ftand ftnot "New Mexico"</code>.</p>

<p> A match to a mild-not selection must
contain at least one token occurrence that satisfies the first
condition and does not satisfy the second condition. If it contains a
token occurrence that satisfies both the first and the second
condition, the occurrence is not considered as a match.</p>

<p>The following expression returns true, because "usability" appears in the
<code>title</code> and the <code>p</code> elements and the occurrence within
the phrase "Usability Testing" in the <code>title</code> element is not
considered:</p>

<eg role="xpath" xml:space="preserve">/book ftcontains "usability" not in "usability
testing"</eg>

<p>Operands of a mild-not selection may not contain a full-text selection
that evaluates to an <term>AllMatches</term> that contains a <term>StringExclude</term>. Such 
full-text selections are not-selections and 
<nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTWords" xlink:type="simple">FTWords</nt> with a cardinality constraint that uses an <code>at most</code>,
<code>from ... to</code>, or <code>exactly</code> occurrence range. </p>
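
<p>As a sketch of this restriction, the right-hand operand in the following
expression is an FTWords with an <code>at most</code> cardinality constraint.
The expression conforms to the grammar, but it is not a permitted mild-not
selection. The search strings are illustrative only:</p>
<eg role="xpath" xml:space="preserve">/book ftcontains "usability" not in
("usability testing" occurs at most 2 times)</eg>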
</div3>

<div3 id="sec-ftnot">
<head>Not-Selection</head>
<p><termdef id="dt-unary-not-selection" term="not-selection">A
<term>not-selection</term> is a full-text selection starting with the prefix 
operator <code>ftnot</code>.</termdef></p>

<p>A not-selection selects matches that do not
satisfy the operand full-text selection.
Details about how such matches are constructed are given in <specref ref="tq-ft-fs-FTUnaryNot"/>.
</p>

<p>The following expression returns the empty sequence, because all <code>book</code>
elements contain "usability":</p>
<eg role="xpath" xml:space="preserve">/book[. ftcontains ftnot "usability"]</eg>

<p>The following expression returns true, because <code>book</code> elements contain
"information" and "retrieval" but not "information retrieval":</p>

<eg role="xpath" xml:space="preserve">/book ftcontains "information" ftand
"retrieval" ftand ftnot "information retrieval"</eg>

<p>The following expression returns <code>book</code> elements containing "web site
usability" but not "usability testing":</p>

<eg role="xpath" xml:space="preserve">/book[. ftcontains "web site usability" ftand 
ftnot "usability testing"]</eg>
</div3>

</div2>

<div2 id="ftposfilter">
        <head>Positional Filters</head>

<scrap headstyle="show"><head/>
	<prod num="157" id="doc-xquery-FTPosFilter"><lhs>FTPosFilter</lhs><rhs><nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTOrder" xlink:type="simple">FTOrder</nt>  |  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTWindow" xlink:type="simple">FTWindow</nt>  |  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTDistance" xlink:type="simple">FTDistance</nt>  |  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTScope" xlink:type="simple">FTScope</nt>  |  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTContent" xlink:type="simple">FTContent</nt></rhs></prod>
</scrap>


<p><termdef id="dt-ftposfilter" term="positional filter">
<term>Positional filters</term> are postfix operators that serve to
filter matches based on various constraints on their positional
information.</termdef></p>

<p>
Recall that the grammar rule for <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTSelection" xlink:type="simple">FTSelection</nt>
allows an arbitrary number of positional filters to follow an
<nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTOr" xlink:type="simple">FTOr</nt>. Multiple adjacent positional filters are
applied from left to right, i.e., the first filter is applied to the
result of the <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTOr" xlink:type="simple">FTOr</nt>, the second is applied to the
result of that first application, and so on.
</p>
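
<p>As an illustrative sketch of this left-to-right application, in the following
expression <code>ordered</code> is applied first to the matches of the and-selection,
and <code>window 3 words</code> is then applied to the matches that satisfy the
order constraint:</p>
<eg role="xpath" xml:space="preserve">/book/title ftcontains "web" ftand "usability"
ordered window 3 words</eg>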

<div3 id="ftorder">
	<head>Ordered Selection</head>

<scrap headstyle="show"><head/>
<prod num="158" id="doc-xquery-FTOrder"><lhs>FTOrder</lhs><rhs>"ordered"</rhs></prod>
</scrap>

<p><termdef id="dt-ordered-selection" term="ordered selection">An
<term>ordered selection</term> consists of a full-text selection followed by 
the postfix operator "ordered".</termdef>

An ordered selection requires the matched tokens and
phrases to appear in the same order as they are written in the
operand selection.
</p>

<p> The default is unordered. Unordered is in effect when ordered is
not specified in the query. Unordered cannot be written explicitly in
the query.  </p>

<p>An ordered selection selects matches which satisfy the operand full-text
selection and for which the order the matching tokens have in the text
is the same order that the corresponding query tokens have in the
operand selection.</p>

<p>The following expression returns true, because titles of <code>book</code> elements
contain "web site" and "usability" in the order in which they are written in
the query, i.e., "web site" must precede "usability":</p>

<eg role="xpath" xml:space="preserve">/book/title ftcontains ("web site" ftand "usability")
ordered </eg>

<p>The following expression returns false, because although "Montana" and "Millicent"
both appear in the <code>book</code> element, they do not appear in the order they
are written in the query:</p>

<eg role="xpath" xml:space="preserve">/book[@number="1"] ftcontains ("Montana" ftand
"Millicent") ordered </eg>


</div3>

<div3 id="ftwindow">
	<head>Window Selection</head>

<scrap headstyle="show"><head/>
	<prod num="159" id="doc-xquery-FTWindow"><lhs>FTWindow</lhs><rhs>"window"  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-AdditiveExpr" xlink:type="simple">AdditiveExpr</nt>  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTUnit" xlink:type="simple">FTUnit</nt></rhs></prod>
        <prod num="161" id="doc-xquery-FTUnit"><lhs>FTUnit</lhs><rhs>"words"  |  "sentences"  |  "paragraphs"</rhs></prod>
</scrap>

<p><termdef id="dt-window-selection" term="window selection">A
<term>window selection</term> consists of a full-text selection followed
by one of the (complex) postfix operators derived from <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTWindow" xlink:type="simple">FTWindow</nt>.</termdef> 
A window selection selects matches which satisfy the operand full-text
selection and for which the matched tokens and phrases, more precisely the 
individual StringIncludes of that match, are found
within a number of <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTUnit" xlink:type="simple">FTUnit</nt>s (words, sentences, and paragraphs). The number of <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTUnit" xlink:type="simple">FTUnit</nt>s is
specified by an AdditiveExpr that is converted as though it were an argument to a
function with the expected type of "xs:integer".</p>

<p>A window selection may cross element
boundaries. The size of the window is not affected by the presence or
absence of element boundaries. Stop words are included in the
computation of the window size whether they are ignored by the query or not.</p>

<p>A match of an <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTSelection" xlink:type="simple">FTSelection</nt> is
considered a match within a window, if there exists a window of at most the
given number of consecutive units (tokens, sentences, or paragraphs) in
the document within which all StringIncludes of the match lie. 
</p>

<p>The following expression returns true, because "web", "site", and "usability" are
within a window of 5 tokens in the <code>title</code> element:</p>

<eg role="xpath" xml:space="preserve">/book/title ftcontains "web" ftand "site"
ftand "usability" window 5 words</eg>

<p>The following expression returns true, because "web" and "site" in the order they are
written in the query and either "usability" or "testing" are within a
window of at most 10 tokens:</p>

<eg role="xpath" xml:space="preserve">/book ftcontains ("web" ftand "site" ordered)
ftand ("usability" ftor "testing") window 10 words</eg>

<p>The following expression returns true, because the <code>title</code> element
contains "Web Site Usability". A similar query on the <code>p</code> element
would not return true, 
because its occurrences of "web site" and "usability" are not within a
window of 3:</p>
<eg role="xpath" xml:space="preserve">/book//title ftcontains "web site" ftand
"usability" window 3 words</eg>

<p>The following expression returns the empty sequence, because in the selected
<code>book</code> element, there is no occurrence of "efficient"
within a window of 3 tokens which would not also contain an occurrence
of "and":</p>

<eg role="xpath" xml:space="preserve">/book[@number="1" and . ftcontains "efficient" 
ftand ftnot "and" window 3 words]</eg>
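
<p>Because the window size is given by an AdditiveExpr, it need not be a literal.
The following sketch, in which the variable name <code>$size</code> is chosen purely
for illustration, computes the size from a variable. It would be expected to behave
like <code>window 5 words</code>:</p>
<eg role="parse-test" xml:space="preserve">
let $size := 3
return /book/title ftcontains "web" ftand "site"
       ftand "usability" window $size + 2 words
</eg>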

<p>
In order to allow meaningful results for nested positional filters,
e.g., a window selection embedded inside a distance selection, the
resulting matches for window selections are formed from the input matches
that satisfy the window constraint as follows. All StringIncludes of
such a match are coerced into a single StringInclude that spans all
token positions from the smallest to the largest position of any input
StringIncludes. This is explained in more detail in Section <specref ref="ftdistance"/>.
</p>
</div3>

<div3 id="ftdistance">
	<head>Distance Selection</head>

<scrap headstyle="show"><head/>
<prod num="160" id="doc-xquery-FTDistance"><lhs>FTDistance</lhs><rhs>"distance"  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTRange" xlink:type="simple">FTRange</nt>  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTUnit" xlink:type="simple">FTUnit</nt></rhs></prod>
<prod num="156" id="doc-xquery-FTRange"><lhs>FTRange</lhs><rhs>("exactly"  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-AdditiveExpr" xlink:type="simple">AdditiveExpr</nt>)<br/>|  ("at"  "least"  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-AdditiveExpr" xlink:type="simple">AdditiveExpr</nt>)<br/>|  ("at"  "most"  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-AdditiveExpr" xlink:type="simple">AdditiveExpr</nt>)<br/>|  ("from"  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-AdditiveExpr" xlink:type="simple">AdditiveExpr</nt>  "to"  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xquery-AdditiveExpr" xlink:type="simple">AdditiveExpr</nt>)</rhs></prod>
</scrap>

<p><termdef id="dt-distance-selection" term="distance selection">A
<term>distance selection</term> consists of a full-text selection followed
by one of the (complex) postfix operators derived from <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTDistance" xlink:type="simple">FTDistance</nt>.</termdef> </p>

<p>A distance selection selects matches which satisfy the operand full-text
selection and for which the matched tokens and phrases satisfy the
specified distance conditions. Distance is specified in units of
<nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTUnit" xlink:type="simple">FTUnit</nt>s (words, sentences, and
paragraphs). The number of intervening 
<nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTUnit" xlink:type="simple">FTUnit</nt>s is specified in the integer value of
<nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTRange" xlink:type="simple">FTRange</nt>. </p>
 
<p><nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTRange" xlink:type="simple">FTRange</nt> specifies a range of integer
values, providing a minimum and maximum value.  Each one of the AdditiveExpr
specified in an <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTRange" xlink:type="simple">FTRange</nt> is converted as though it were an
argument to a function with the expected parameter type of
"xs:integer".</p>

<p>Let the value of the first (or only) operand be M.  If "from" is
specified, let the value of the second operand be N. 
A distance selection may cross element boundaries when computing
distance.
</p>

<p>The following rule applies to the computation of distance:</p>

<ulist>
<item><p> Zero words (sentences, paragraphs) means adjacent tokens
(sentences, paragraphs).</p></item>
</ulist>

<p>If "exactly" is specified, then the range is the closed interval [M, 
M].  If "at least" is specified, then the range is the half-closed interval 
[M, unbounded).  If "at most" is specified, then the range is the closed 
interval [0, M].  If "from ... to" is specified, then the range is the closed 
interval [M, N]. Note: If M is greater than N, the range is empty. </p>

<p>Here are some examples of  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTRange" xlink:type="simple">FTRange</nt>s:</p>

<olist><item><p>'exactly 0' specifies the range [0, 0].</p></item>
    <item><p>'at least 1' specifies the range [1, unbounded).</p></item> 
    <item><p>'at most 1' specifies the range [0, 1]. </p></item>
    <item><p>'from 5 to 10' specifies the range [5, 10].</p></item>
</olist>

<p>The distances computed by a 
<termref def="dt-distance-selection">distance selection</termref> are not
affected by the presence or absence of element boundaries in the text.
Stop words are counted in those computations whether they are ignored
or not.</p> 


<p>The following expression returns false, because "completion" and "errors" are
less than 11 tokens apart:</p>

<eg role="xpath" xml:space="preserve">/book ftcontains ("completion" ftand "errors" 
distance at least 11 words)</eg>

<p>The following expression returns true, because the <code>book</code> element
contains tokens "web", "site", and "usability" that have at
most 2 intervening tokens between them:</p>

<eg role="xpath" xml:space="preserve">/book ftcontains "web" ftand "site" ftand
"usability" distance at most 2 words</eg>

<p>The following expression returns the empty sequence, because 
between any token "usability" and the token in any occurrence of the phrase 
"web site" that is the nearest to the token "usability" there is always more 
than one intervening token: </p>

<eg role="xpath" xml:space="preserve">/book[.//p ftcontains "web site"
ftand "usability" distance at most 1 words] </eg>

<p>The following expression returns the <code>book</code> title, because for 
the occurrences of the tokens "web" and "users" in the <code>note</code> 
element only one intervening token appears: </p>

<eg role="xpath" xml:space="preserve">/book[. ftcontains "web"
ftand "users" distance at most 1 words]/title </eg>
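
<p>The interval definitions above can be sketched with two further expressions,
which are illustrative only. Because <code>at most 1</code> and
<code>from 0 to 1</code> both denote the range [0, 1], the first expression is
equivalent to the preceding example. The second combines <code>ordered</code> with
<code>distance exactly 0 words</code> to require that "web" be immediately
followed by "site":</p>
<eg role="xpath" xml:space="preserve">/book[. ftcontains "web"
ftand "users" distance from 0 to 1 words]/title </eg>
<eg role="xpath" xml:space="preserve">/book ftcontains "web" ftand "site"
ordered distance exactly 0 words</eg>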

<!-- JD, 2005-08-17: need to revise the foll. 2 examples;
<p>The following expression returns the
<code>title</code> element, because the token "learning" not appears
within 15 tokens of the tokens "web site" and "completion":</p>

<eg role="xpath">/book[@number="1" and . ftcontains ("web site"
ftand "completion" ftand ftnot  "learning") distance
exactly 15 words]/title
</eg>

<p>The following expression returns the <code>title</code>
element if the tokens "web site" and "completion" appear within 15
tokens of each other and in the same paragraph:</p>
<eg role="xpath">/book[@number="1" and . ftcontains "web site"
ftand "completion" distance exactly 15 words same
paragraph]/title </eg>

-->
<p>
In order to allow meaningful results for nested positional filters,
e.g., a distance selection embedded inside another distance selection, the
resulting matches for distance selections are formed from the input matches
that satisfy the distance constraint as follows. All StringIncludes of
such a match are coerced into a single StringInclude that spans all
token positions from the smallest to the largest position of any input
StringIncludes. Thus, a distance selection that embeds a window or a
distance selection takes the result of the embedded selection as a
single unit.
</p>

<p>
The following gives an example of nested distance selections:
</p>

<eg role="xpath" xml:space="preserve">/books ftcontains ((("richard" ftand "nixon") distance at most 2 words) 
                   ftand 
                   (("george" ftand "bush") distance at most 2 words) 
                  distance at least 20 words)</eg>

<p>
This expression returns true if the <code>books</code> element contains, for instance, 
"Richard M. Nixon"  and "George W. Bush" at least 20 words apart. The
matches for the inner distance selections are treated as single units
(represented by StringIncludes) by the outer distance
selection. If such phrases are present in 
the search context, the outer distance selection
enforces a constraint on the number of intervening tokens ("at least
20") between the
last token of "Richard M. Nixon" and the first token of "George
W. Bush".
</p>


</div3>

<div3 id="ftscope">
	<head>Scope Selection</head>

<scrap headstyle="show"><head/>
<prod num="162" id="doc-xquery-FTScope"><lhs>FTScope</lhs><rhs>("same"  |  "different")  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTBigUnit" xlink:type="simple">FTBigUnit</nt></rhs></prod>
<prod num="163" id="doc-xquery-FTBigUnit"><lhs>FTBigUnit</lhs><rhs>"sentence"  |  "paragraph"</rhs></prod>
</scrap>

<p><termdef id="dt-scope-selection" term="scope selection">A
<term>scope selection</term> consists of a full-text selection followed
by one of the (complex) postfix operators derived from <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTScope" xlink:type="simple">FTScope</nt>.</termdef> </p>

<p>A scope selection selects matches which satisfy the operand full-text
selection and for which the matched tokens and phrases are
contained in the same scope or in different scopes. </p>

<p> Possible scopes are sentences and paragraphs. </p>

<p> By default, there are no restrictions on the scope of the
matches. </p>

<p>The following expression returns false, because the tokens "usability" and "Marigold"
are not contained within the same sentence:</p>

<eg role="xpath" xml:space="preserve">/book ftcontains "usability"
ftand "Marigold" same sentence</eg>

<p>The following expression returns true, because the tokens "usability" and "Marigold"
are contained within different sentences: </p>

<eg role="xpath" xml:space="preserve">/book ftcontains "usability"
ftand "Marigold" different sentence</eg>

<p>The following expression returns a <code>book</code> element, because it contains
"usability" and "testing" in the same paragraph:</p>

<eg role="xpath" xml:space="preserve">/book[. ftcontains "usability" ftand "testing"
same paragraph] </eg>

<p>The following expression returns a <code>book</code> element, because "site" and
"errors" appear in the same sentence:</p>

<eg role="xpath" xml:space="preserve">/book[. ftcontains "site" ftand "errors"
same sentence] </eg>


<p>It is possible that both "same sentence" and "different sentence" conditions are
simultaneously satisfied for several tokens and/or phrases within the same 
document fragment. This can be observed if there are occurrences of the tokens
and/or phrases both within the same sentence and within different sentences. For
example, consider the following document fragment. </p>

<eg role="parse-test" xml:space="preserve">
&lt;introduction&gt;
... The usability of a Web site is how well the site supports the user in
achieving specified goals. ... Expert reviews and usability testing are methods of
identifying problems in layout, terminology, and navigation. ...
&lt;/introduction&gt;
</eg>

<p>This sample will satisfy both conditions <code>("usability" ftand "reviews")
different sentence</code> and <code>("usability" ftand "reviews") same
sentence</code>. The tokens "usability" and "reviews" occur both in different sentences
(the first and second shown sentences) and in the same sentence (the second shown
sentence). </p>

<p>The above observation also holds for the "same paragraph" and "different paragraph"
conditions.</p>


</div3>

<div3 id="ftcontent">
	<head>Anchoring Selection</head>

<scrap headstyle="show"><head/>
			<prod num="164" id="doc-xquery-FTContent"><lhs>FTContent</lhs><rhs>("at"  "start")  |  ("at"  "end")  |  ("entire"  "content")</rhs></prod>
</scrap>
 
<p><termdef id="dt-anchoring-selection" term="anchoring selection">An
<term>anchoring selection</term> consists of a full-text selection followed
by one of the postfix operators "at start", "at end", or "entire content".</termdef> </p>

<p>An anchoring selection selects matches which satisfy the operand full-text
selection and for which the matched tokens and phrases are
the first, last, or all tokens in the tokenized form of the items being searched.
</p>

<p>With the "at start" operator, only tokens or phrases that
are the first tokens or phrases in the tokenized string value 
of the item being searched are matched.</p>

<p>With the "at end" operator, only tokens or phrases that
are the last tokens or phrases in the tokenized string value of the
item being searched are matched.</p>
 
<p>With the "entire content" operator, only tokens or phrases that
make up the entire content of the tokenized string value of the 
item being searched are matched.</p>
 
<p>The following expression returns each <code>title</code> element starting with the
phrase "improving the usability of a web site":</p>
<eg role="xpath" xml:space="preserve">/books//title[. ftcontains "improving the usability
of a web site" at start]</eg> 

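<p>The following expression returns each <code>p</code> element that ends with a match
of the wildcard token "propagat.*" and the phrase "few errors" at a distance of at most
2 words, for example a <code>p</code> element ending with "propagating few errors":</p>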
 
<eg role="xpath" xml:space="preserve">/books//p[. ftcontains "propagat.*" with wildcards ftand "few
errors" distance at most 2 words at end]</eg>

<p>The following expression returns each <code>note</code> element whose entire content
is "this site has been approved by the web site users association":</p>

<eg role="xpath" xml:space="preserve">/books//note[. ftcontains "this site has been
approved by the web site users association" entire content]</eg>
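
<p>To contrast the operators, assume a hypothetical <code>title</code>
element whose tokenized string value is exactly "improving the usability of a
web site". This element is an assumption made for illustration and is not taken
from a sample document. Under that assumption, the following expression would be
expected to select the element, because the phrase forms its last tokens:</p>

<eg role="xpath" xml:space="preserve">/books//title[. ftcontains "a web site" at end]</eg>

<p>The following expression would not be expected to select it, because the
phrase covers only the first three of its seven tokens rather than its entire
content:</p>

<eg role="xpath" xml:space="preserve">/books//title[. ftcontains "improving the usability" entire content]</eg>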

</div3>


</div2>


<div2 id="fttimes">
	<head>Cardinality Selection</head>

<scrap headstyle="show"><head/>
			<prod num="155" id="doc-xquery-FTTimes"><lhs>FTTimes</lhs><rhs>"occurs"  <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTRange" xlink:type="simple">FTRange</nt>  "times"</rhs></prod>
</scrap>

<p><termdef id="dt-cardinality-selection" term="cardinality selection">A
<term>cardinality selection</term> consists of an 
<nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTWords" xlink:type="simple">FTWords</nt> followed
by the <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTTimes" xlink:type="simple">FTTimes</nt> postfix operator.</termdef>
A cardinality selection selects matches for which the operand 
<nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTWords" xlink:type="simple">FTWords</nt> is matched a specified number of
times. </p>

<p>A cardinality selection restricts the number of different
matches of <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTWords" xlink:type="simple">FTWords</nt> to the
specified range. The semantics of FTRange are described in 
<specref ref="ftdistance"/>. </p>
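
<p>As a sketch of the syntax, any form of FTRange can appear between "occurs"
and "times". The path <code>/books//p</code> and the search token are
assumptions chosen for illustration. The following expression selects each
<code>p</code> element in which "usability" is matched at least 2 times:</p>

<eg role="xpath" xml:space="preserve">/books//p[. ftcontains "usability" occurs at least 2 times]</eg>

<p>The following expression selects each <code>p</code> element in which
"usability" is matched 1, 2, or 3 times:</p>

<eg role="xpath" xml:space="preserve">/books//p[. ftcontains "usability" occurs from 1 to 3 times]</eg>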

<p>In the document fragment "very very big":</p>

<olist>

<item>
<p>
The <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTWords" xlink:type="simple">FTW