This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 2485 - tokenization parameters: for data for query or for both ?
Summary: tokenization parameters: for data for query or for both ?
Status: CLOSED FIXED
Alias: None
Product: XPath / XQuery / XSLT
Classification: Unclassified
Component: Full Text 1.0
Version: Working drafts
Hardware: Macintosh All
Importance: P2 normal
Target Milestone: ---
Assignee: Sihem Amer-Yahia
QA Contact: Mailing list for public feedback on specs from XSL and XML Query WGs
Reported: 2005-11-08 18:08 UTC by Daniela Florescu
Modified: 2006-02-20 19:29 UTC

Description Daniela Florescu 2005-11-08 18:08:14 UTC
This issue concerns the options that affect the tokenization process.

The question is: do they influence only the tokenization of the
query string, or the tokenization of both the query string and the
data string?
Comment 1 Michael Rys 2005-11-08 18:16:52 UTC
Note that, to support systems that tokenize the search corpus a priori
using FT indices (like all the ones we have), the tokenization
parameters of ftcontains should apply only to the search strings and
not to the input data.
Comment 2 Sihem Amer-Yahia 2006-01-30 16:54:29 UTC
Added a note in Section 4:

Because tokenization is implementation-defined, the
tokenization of each item in $searchContext does not necessarily take
into account the match options in $matchOptions. This allows
implementations to tokenize and index input data without the knowledge
of particular match options used in full-text queries.
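
To illustrate the separation this note permits, here is a minimal Python
sketch. It is not taken from the specification or from any implementation:
the function names, the regular-expression tokenizer, and the single
case_sensitive flag standing in for a match option are all assumptions
made for illustration. The input data is tokenized and indexed once under
a fixed policy, and the match option influences only how the query string
is tokenized.

```python
import re
from collections import defaultdict


def tokenize_data(text):
    """Index-time tokenizer (hypothetical): fixed policy, no match options."""
    return [token.lower() for token in re.findall(r"\w+", text)]


def tokenize_query(text, case_sensitive=False):
    """Query-time tokenizer (hypothetical): honours one match option."""
    tokens = re.findall(r"\w+", text)
    return tokens if case_sensitive else [token.lower() for token in tokens]


def build_index(docs):
    """Build an inverted index before any query or match option is known."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize_data(text):
            index[token].add(doc_id)
    return index


def search(index, query, case_sensitive=False):
    """Return ids of documents that contain every token of the query."""
    tokens = tokenize_query(query, case_sensitive)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


docs = {1: "Full Text Search", 2: "full moon"}
index = build_index(docs)

print(search(index, "Full"))                       # both documents match
print(search(index, "full text"))                  # only document 1 matches
print(search(index, "Full", case_sensitive=True))  # no match: index is lowercased
```

In this sketch the last query finds nothing, because the index was built
with a lowercasing tokenizer that knew nothing about the option. That is
the trade-off the note allows: the data can be indexed ahead of time, at
the cost of match options not necessarily affecting how it was tokenized.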