This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.

Bug 3939 - [FT] Section 4.1.1: Example for overlapping tokens
Summary: [FT] Section 4.1.1: Example for overlapping tokens
Status: CLOSED FIXED
Alias: None
Product: XPath / XQuery / XSLT
Classification: Unclassified
Component: Full Text 1.0
Version: Working drafts
Hardware: PC Windows XP
Importance: P2 normal
Target Milestone: ---
Assignee: Jim Melton
QA Contact: Mailing list for public feedback on specs from XSL and XML Query WGs
Reported: 2006-11-01 00:45 UTC by Michael Rys
Modified: 2007-04-20 15:44 UTC

Description Michael Rys 2006-11-01 00:45:01 UTC
Provide an example of overlapping tokens, for instance German compound words such as Donaudampfschifffahrtskapitaensmuetze.
Comment 1 Jim Melton 2007-02-01 21:57:36 UTC
The Task Force has agreed to provide such an example.  In Section 4.1, Tokenization, immediately prior to section 4.1.1, we will insert a paragraph that reads:
For some languages, some tokenizers may identify overlapping tokens.  For example, the German word "Donaudampfschifffahrtskapitaensmuetzen" might be tokenized into the following tokens: Donaudampfschifffahrtskapitaensmuetzen, Donau, dampf, schiff, dampfschiff, kapitaen, muetzen, kapitaensmuetzen, schifffahrt, dampfschifffahrt, and perhaps others. 
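To illustrate what "overlapping tokens" means for the example word above, here is a minimal Python sketch of a lexicon-based decompounder. It is not taken from the Full Text specification or from any particular tokenizer: the function name overlapping_tokens, the LEXICON set, and the (start, end, text) span representation are all assumptions made for illustration. The sketch reports every lexicon entry found anywhere inside the word, so the resulting spans may overlap or nest.

```python
def overlapping_tokens(word, lexicon):
    """Return the whole word plus every lexicon entry occurring inside it.

    Each token is a (start, end, text) tuple; spans may overlap or nest.
    `lexicon` is assumed to hold lowercase entries (hypothetical data).
    """
    lower = word.lower()
    n = len(lower)
    spans = [(0, n, word)]  # the full compound always counts as a token
    for start in range(n):
        for end in range(start + 1, n + 1):
            if (start, end) == (0, n):
                continue  # already recorded above
            if lower[start:end] in lexicon:
                spans.append((start, end, word[start:end]))
    return spans


# Hypothetical lexicon containing only the parts named in the example.
LEXICON = {
    "donau", "dampf", "schiff", "dampfschiff", "kapitaen", "muetzen",
    "kapitaensmuetzen", "schifffahrt", "dampfschifffahrt",
}

tokens = overlapping_tokens("Donaudampfschifffahrtskapitaensmuetzen", LEXICON)
for start, end, text in tokens:
    print(start, end, text)
```

In this sketch, "dampfschiff" and "schifffahrt" share the characters of "schiff", which is the kind of overlap the paragraph describes. A real tokenizer would likely use linguistic rules rather than a brute-force substring scan.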
Comment 2 Jochen Doerre 2007-04-09 09:39:25 UTC
Done.
Comment 3 Jim Melton 2007-04-20 15:44:15 UTC
Because you participated in the TF when this bug was resolved, we presume that
your concerns are addressed appropriately.  We are therefore marking this bug
as CLOSED.