This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 3742 - [FT] language=none not useful
Summary: [FT] language=none not useful
Status: CLOSED FIXED
Alias: None
Product: XPath / XQuery / XSLT
Classification: Unclassified
Component: Full Text 1.0
Version: Working drafts
Hardware: PC Windows XP
Importance: P2 normal
Target Milestone: ---
Assignee: Mary Holstege
QA Contact: Mailing list for public feedback on specs from XSL and XML Query WGs
 
Reported: 2006-09-18 19:44 UTC by Mary Holstege
Modified: 2006-11-01 17:20 UTC


Description Mary Holstege 2006-09-18 19:44:33 UTC
Section 3.2.6 (FTLanguageOption)
Appendix C (Static Context)
Appendix I (Checklist of Implementation-Defined)

language="none" is not a sensible choice. Tokenization is intrinsically language-dependent, and full-text search is intrinsically dependent on tokenization. I believe we should drop language=none and require the static context to define a specific (implementation-defined) default language.
Comment 1 Mary Holstege 2006-09-18 20:03:56 UTC
This overstates the rationale somewhat: for many languages, a simple whitespace/punctuation-delimited tokenization does the job. However, there are also languages where tokenization is entirely language-dependent. Further, stemming is certainly entirely language-dependent, so at the very least the spec needs to say what a stemmed query with language=none means.
In general, we still prefer that language=none be dropped, so that those kinds of messy questions can be avoided entirely.
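The distinction drawn in this comment can be illustrated with a minimal Python sketch. It is not taken from the specification or from any implementation: the function name simple_tokenize and the two sample strings are assumptions made for illustration. It splits on whitespace and punctuation only, which yields usable tokens for English but returns an unsegmented Chinese sentence as a single token.

```python
import re


def simple_tokenize(text: str) -> list[str]:
    """Hypothetical language-agnostic tokenizer: split on runs of
    non-word characters (whitespace and punctuation) and underscores."""
    return [token for token in re.split(r"[\W_]+", text) if token]


# English: whitespace and punctuation mark the word boundaries.
english_tokens = simple_tokenize("Full-text search depends on tokenization.")
print(english_tokens)
# ['Full', 'text', 'search', 'depends', 'on', 'tokenization']

# Chinese: no delimiters between words, so the whole sentence comes back
# as one token. Finding the words needs language-specific segmentation.
chinese_tokens = simple_tokenize("全文检索依赖于分词")
print(chinese_tokens)
# ['全文检索依赖于分词']
```

Stemming raises the same problem more sharply, since even the English tokens above cannot be reduced to stems without knowing which language's rules to apply.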
Comment 2 Mary Holstege 2006-11-01 17:19:59 UTC
The WG decided to change the specification:
The default language is specified in the static context. Remove "none".
Add "in an implementation-defined way" at the end of the sentence: The "language" option influences tokenization, stemming, and stop words.
3rd para: DELETE 1) "either" 2) ", or be the value "none"".
DELETE the 4th para: If the language "none" option is specified, no language selected.
6th para: change to: The default language is specified in the static context.
Annex C: 1) correct the FTLanguageOption row: instead of "no language is selected" use "implementation-defined"; 2) correct the last column: eliminate "or none".
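
The resolution amounts to a simple lookup rule, sketched below in Python under stated assumptions. The names StaticContext and resolve_language are hypothetical, the default "en" is only an example of an implementation-defined choice, and raising ValueError for "none" is an assumption: the comment above does not say how a processor should report the removed value.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StaticContext:
    # Hypothetical model of the static context. The WG decision only says
    # that the default language lives here and is implementation-defined.
    default_language: str


def resolve_language(ctx: StaticContext, requested: Optional[str]) -> str:
    """Return the language governing tokenization, stemming, and stop words."""
    if requested is None:
        # No language option on the query: use the static-context default.
        return ctx.default_language
    if requested.lower() == "none":
        # Assumption: the removed value is rejected outright.
        raise ValueError('language "none" is not a valid language option')
    return requested


ctx = StaticContext(default_language="en")  # example default only
print(resolve_language(ctx, None))   # en
print(resolve_language(ctx, "de"))   # de
```

With this rule every query has a concrete language, so the question of what a stemmed query means without one no longer arises.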