This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 4723 - [FT] editorial: 4.1.2 Representations of Tokenized Text and Matching
Status: CLOSED FIXED
Alias: None
Product: XPath / XQuery / XSLT
Classification: Unclassified
Component: Full Text 1.0
Version: Last Call drafts
Hardware: All All
Importance: P2 minor
Target Milestone: ---
Assignee: Mary Holstege
QA Contact: Mailing list for public feedback on specs from XSL and XML Query WGs
 
Reported: 2007-06-23 10:11 UTC by Michael Dyck
Modified: 2007-10-26 23:57 UTC

Description Michael Dyck 2007-06-23 10:11:13 UTC
4.1.2 Representations of Tokenized Text and Matching

[1]
paras 2-4
"A [Definition: ..."
    Put the "A" inside the definition.

[2]
para 3
"relative position of the search string in the query in document order"
    There's no such thing as document order for things in the query.

[3]
bullets
    It'd be a bit easier to read if you put the name of the property at
    the start of the bullet-blurb, rather than the end.

[4]
startPos & endPos
"a unique identifier that captures the relative position..."
    [4a]
    The other properties just say "the relative position...". What's the
    significance of this difference?

    [4b]
    "unique":
    Not really helpful unless you specify the arena over which it's
    unique. E.g., "Within the results of a single tokenization, the
    startPos of any two distinct TokenInfos are distinct."

    [4c]
    "identifier"
    s/identifier/integer/, given the way it's used and given the schema in
    4.2.1.3.
    For that matter, at "Each TokenInfo is associated with", you could
    append "six integer-valued properties".
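
The uniqueness statement suggested in [4b] can be read as a checkable invariant. The following is only a rough sketch under stated assumptions: the `TokenInfo` class, its `word` field, the whitespace `tokenize` function, and the 1-based positions are all hypothetical illustrations, not the spec's definitions, and only the two properties named above (startPos and endPos) are modelled, as integers per [4c].

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class TokenInfo:
    # Hypothetical model: only startPos and endPos from the report are
    # represented; the remaining properties are omitted.
    word: str        # assumed field, for readability only
    start_pos: int   # integer-valued, as suggested in [4c]
    end_pos: int


def tokenize(text: str) -> List[TokenInfo]:
    """Hypothetical whitespace tokenizer; positions are counted from 1."""
    return [TokenInfo(w, i, i) for i, w in enumerate(text.split(), start=1)]


def start_positions_are_distinct(tokens: List[TokenInfo]) -> bool:
    """Check the [4b] wording: within the results of a single
    tokenization, no two distinct TokenInfos share a startPos."""
    positions = [t.start_pos for t in tokens]
    return len(positions) == len(set(positions))
```

Under these assumptions, the check holds for any output of `tokenize`, and it fails for a hand-built list in which two TokenInfos carry the same `start_pos`. That is the sense in which "unique" only means something once the arena (one tokenization) is named.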

Comment 1 Mary Holstege 2007-08-23 20:13:55 UTC
WRT comments #1 and #3: classified as editorial and done.

Comment 2 Mary Holstege 2007-10-11 23:24:15 UTC
Comments #2 and #4 fixed up as part of fix wrt overlapping tokens. Done.