Errata for XQuery 1.0 and XPath 2.0 Functions and Operators

23 November 2007

Latest version:
http://www.w3.org/XML/2007/qt-errata/xpath-functions-errata.html
Editor:
Michael Kay, Saxonica http://www.saxonica.com/

Abstract

This document addresses errors in the XQuery 1.0 and XPath 2.0 Functions and Operators Recommendation published on 23 January 2007. It records all errors that, at the time of this document's publication, have solutions that have been approved by the XSL Working Group and/or the XML Query Working Group. For updates, see the latest version of this document.

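The errata are numbered, classified as Substantive, Editorial, or Markup, and are listed in reverse chronological order of their date of origin. Each entry contains the following information: its identifier and classification, a reference to the corresponding Bugzilla entry or entries, a description of the error, a history of the erratum's status, and the change or changes to be made to the text of the Recommendation.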

Colored boxes and shading are used to help distinguish new text from old; however, these visual cues are not essential to an understanding of the change. The styling of old and new text is an approximation to its appearance in the published Recommendation, but is not normative. Hyperlinks are shown underlined in the erratum text, but the links are not live.

A number of indexes appear at the end of the document.

Substantive corrections are proposed by the XSL Working Group and/or the XML Query Working Group (both part of the XML Activity), which have consensus that they are appropriate; they are not to be considered normative until approved by a Call for Review of Proposed Corrections or a Call for Review of an Edited Recommendation.

Please report errors in this document using W3C's public Bugzilla system (instructions can be found at http://www.w3.org/XML/2005/04/qt-bugzilla). If access to that system is not feasible, you may send your comments to the W3C XSLT/XPath/XQuery public comments mailing list, public-qt-comments@w3.org. It will be very helpful if you include the string [FOerrata] in the subject line of your report, whether made in Bugzilla or in email. Each Bugzilla entry and email message should contain only one error report. Archives of the comments and responses are available at http://lists.w3.org/Archives/Public/public-qt-comments/.

Status of this Document

This is a public draft. None of the errata reported in this document have been approved by a Call for Review of Proposed Corrections or a Call for Review of an Edited Recommendation. As a consequence, they must not be considered to be normative.

Table of Contents

  Errata

     FO.E16   In fn:lang, the list item numbers (1) and (2) are duplicated.

     FO.E15   In fn:namespace-uri, the terminology "the namespace URI of the xs:QName of $arg" is incorrect.

     FO.E14   In fn:normalize-space, a sentence with multiple conditions is ambiguously worded.

     FO.E13   The conditions under which a node has the is-id or is-idref property need to be clarified.

     FO.E12   When multiplying or dividing a yearMonthDuration by a number, rounding behavior is underspecified.

     FO.E11   Although the specification states that a string literal can be cast to an xs:QName or xs:NOTATION, the semantics of the operation are not described in the obvious place.

     FO.E10   In 17.1.2, the procedure for casting xs:NOTATION to xs:string does not work because it uses functions that are defined only on xs:QName.

     FO.E9   In Appendix D, the function signature of the fn:translate function is quoted incorrectly.

     FO.E8   A character code confuses decimal and hexadecimal notation.

     FO.E7   The meaning of the regex flag "m" is unclear when the last character in the string is a newline.

     FO.E6   Casting from date and time type to string represents the UTC timezone as "+00:00" rather than as "Z".

     FO.E5   The function signatures for the internal functions op:subtract-dates and op:subtract-dateTimes incorrectly allow an empty sequence as the return value.

     FO.E4   The regex specification allows a back-reference within square brackets, which is meaningless.

     FO.E3   An example under fn:idref is incorrectly formatted.

     FO.E2   The description of fn:subsequence contains a spurious variable $p.

     FO.E1   In fn:resolve-uri it is unclear what happens when the supplied base URI is a relative reference.

  Indexes

    Index by affected section

    Index by Bugzilla entry

    Index by function

    Index by error-code

    Index by type


FO.E16 - markup

See Bug 5246

Description

In fn:lang, the list item numbers (1) and (2) are duplicated.

History

15 Nov 2007: Proposed

Change

In 14.5 fn:lang (seventh paragraph, first numbered list):

Replace the text:

  1. (1) $testlang is equal to the string-value of the relevant xml:lang attribute, or

  2. (2) $testlang is equal to some substring of the string-value of the relevant xml:lang attribute that starts at the start of the string-value and ends immediately before a hyphen, "-" (The character "-" is HYPHEN-MINUS, #x002D).

By:

  1. $testlang is equal to the string-value of the relevant xml:lang attribute, or

  2. $testlang is equal to some substring of the string-value of the relevant xml:lang attribute that starts at the start of the string-value and ends immediately before a hyphen, "-" (The character "-" is HYPHEN-MINUS, #x002D).

FO.E15 - editorial

See Bug 5235

Description

In fn:namespace-uri, the terminology "the namespace URI of the xs:QName of $arg" is incorrect. It's not clear that it's referring to the name of the node, rather than (say) its type annotation.

History

15 Nov 2007: Proposed

Change

In 14.3 fn:namespace-uri (first paragraph):

Replace the text:

Summary: Returns the namespace URI of the xs:QName of $arg.

By:

Summary: Returns the namespace URI part of the name of $arg, as an xs:anyURI value.

FO.E14 - editorial

See Bug 4974

Description

In fn:normalize-space, a sentence with multiple conditions is ambiguously worded. To solve the problem, the relevant sentence can be simplified, because it doesn't need to say what happens when the argument is "." and there is no context item; that's covered in the rules for evaluating ".".

History

25 Sep 2007: Accepted

Change

In 7.4.5 fn:normalize-space (fourth paragraph):

Replace the text:

If no argument is supplied, $arg defaults to the string value (calculated using fn:string()) of the context item (.). If no argument is supplied or if the argument is the context item and the context item is undefined an error is raised: [err:XPDY0002]XP.

By:

If no argument is supplied, then $arg defaults to the string value (calculated using fn:string()) of the context item (.). If no argument is supplied and the context item is undefined an error is raised: [err:XPDY0002]XP.

FO.E13 - substantive

See Bug 4519

Description

The conditions under which a node has the is-id or is-idref property need to be clarified. (See also the corresponding erratum DM.E005 to XDM.)

History

16 Oct 2007: Proposed

Changes

  1. In 15.5.2 fn:id (first notes, second paragraph):

    Replace the text:

    If the data model is constructed from a PSVI, an element or attribute will have the is-id property if its schema-defined type is xs:ID or a type derived by restriction from xs:ID.

    By:

    If the data model is constructed from a PSVI, an element or attribute will have the is-id property if its typed value is a single atomic value of type xs:ID or a type derived by restriction from xs:ID.

  2. In 15.5.3 fn:idref (first notes):

    Insert after the text:

    Notes:

    An element or attribute typically acquires the is-idrefs property by being validated against the schema type xs:IDREF or xs:IDREFS, or (for attributes only) by being described as of type IDREF or IDREFS in a DTD.

    No error is raised in respect of a candidate ID value that does not match the IDREF value of any element or attribute in the document. If no candidate ID value matches the IDREF value of any element or attribute, the function returns the empty sequence.

    It is possible for two or more nodes to have an IDREF value that matches a given candidate ID value. In this situation, the function will return all such nodes. However, each matching node will be returned at most once, regardless how many candidate ID values it matches.

    It is possible in a well-formed but invalid document to have a node whose is-idrefs property is true but that does not conform to the lexical rules for the xs:IDREF type. The effect of the above rules is that ill-formed candidate ID values and ill-formed IDREF values are ignored.

    The following:

    If the data model is constructed from a PSVI, the typed value of a node that has the is-idrefs property will contain at least one atomic value of type xs:IDREF (or a type derived by restriction from xs:IDREF). It may also contain atomic values of other types. These atomic values are treated as candidate ID values if their lexical form is valid as an xs:NCName, and they are ignored otherwise.

FO.E12 - substantive

See Bug 4621

Description

When multiplying or dividing a yearMonthDuration by a number, rounding behavior is underspecified.

History

27 Jun 2007: Proposed

Changes

  1. In 10.6.3 op:multiply-yearMonthDuration (first paragraph):

    Replace the text:

    Summary: Returns the result of multiplying the value of $arg1 by $arg2. The result is rounded to the nearest month. For a value v, 0 <= v < 0.5 rounds to 0; 0.5 <= v < 1.0 rounds to 1.

    By:

    Summary: Returns the result of multiplying the value of $arg1 by $arg2. The result is rounded to the nearest month.

    The result is the xs:yearMonthDuration whose length in months is equal to the result of applying the fn:round function to the value obtained by multiplying the length in months of $arg1 by the value of $arg2.

  2. In 10.6.4 op:divide-yearMonthDuration (first paragraph):

    Replace the text:

    Summary: Returns the result of dividing the value of $arg1 by $arg2. The result is rounded to the nearest month. For a value v, 0 <= v < 0.5 rounds to 0; 0.5 <= v < 1.0 rounds to 1.

    By:

    Summary: Returns the result of dividing the value of $arg1 by $arg2. The result is rounded to the nearest month.

    The result is the xs:yearMonthDuration whose length in months is equal to the result of applying the fn:round function to the value obtained by dividing the length in months of $arg1 by the value of $arg2.
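As an informal illustration (not part of the erratum text), the revised rounding rule can be modeled in Python. The sketch assumes that a duration is represented simply by its signed length in months; the helper names xpath_round, multiply_ymd, divide_ymd and format_ymd are hypothetical. fn:round rounds halves toward positive infinity, which floor(x + 1/2) reproduces for finite values. Exact fractions are used so that binary floating point does not disturb values lying exactly on a half-month boundary.

```python
import math
from fractions import Fraction


def xpath_round(value: Fraction) -> int:
    """Model fn:round for finite values: nearest integer, halves toward +infinity."""
    return math.floor(value + Fraction(1, 2))


def multiply_ymd(months: int, factor: str) -> int:
    """Length in months of (duration * factor); factor is a decimal literal such as "2.3"."""
    return xpath_round(Fraction(months) * Fraction(factor))


def divide_ymd(months: int, divisor: str) -> int:
    """Length in months of (duration div divisor); a zero divisor raises ZeroDivisionError."""
    return xpath_round(Fraction(months) / Fraction(divisor))


def format_ymd(months: int) -> str:
    """Render a month count in a PnYnM style lexical form (assumed formatting)."""
    sign = "-" if months < 0 else ""
    years, rest = divmod(abs(months), 12)
    if years and rest:
        return f"{sign}P{years}Y{rest}M"
    if years:
        return f"{sign}P{years}Y"
    return f"{sign}P{rest}M"


# P2Y11M is 35 months; 35 * 2.3 = 80.5, which rounds up to 81 months.
print(format_ymd(multiply_ymd(35, "2.3")))  # P6Y9M
# 35 / 1.5 = 23.33..., which rounds to 23 months.
print(format_ymd(divide_ymd(35, "1.5")))    # P1Y11M
```

Under this rule a negative half rounds toward positive infinity as well: -35 months multiplied by 2.3 gives -80.5, which becomes -80 months rather than -81.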

FO.E11 - editorial

See Bug 4874

Description

Although the specification states that a string literal can be cast to an xs:QName or xs:NOTATION, the semantics of the operation are not described in the obvious place. This erratum adds a cross-reference.

History

29 Jul 2007: Proposed

Change

In 17.1.1 Casting from xs:string and xs:untypedAtomic (fifth paragraph):

Replace the text:

Casting is permitted from xs:string literals to xs:QName and types derived from xs:NOTATION. If the argument to such a cast is computed dynamically, [err:XPTY0004]XP is raised if the value is of any type other than xs:QName or xs:NOTATION respectively (including the case where it is an xs:string).

By:

Casting is permitted from xs:string literals to xs:QName and types derived from xs:NOTATION. If the argument to such a cast is computed dynamically, [err:XPTY0004]XP is raised if the value is of any type other than xs:QName or xs:NOTATION respectively (including the case where it is an xs:string). The process is described in more detail in 5.3 Constructor Functions for xs:QName and xs:NOTATION.

FO.E10 - substantive

See Bug 4874

Description

In 17.1.2, the procedure for casting xs:NOTATION to xs:string does not work because it uses functions that are defined only on xs:QName.

History

29 Jul 2007: Proposed

Change

In 17.1.2 Casting to xs:string and xs:untypedAtomic (first bulleted list, third item, first bulleted list, first item):

Replace the text:

if the qualified name has a prefix TV is (fn:concat(fn:prefix-from-QName(SV), ":", fn:local-name-from-QName(SV)).

By:

if the qualified name has a prefix TV is the concatenation of the prefix of SV, a single colon (:), and the local name of SV.

FO.E9 - editorial

See Bug 4549

Description

In Appendix D, the function signature of the fn:translate function is quoted incorrectly.

History

9 May 2007: Proposed

Change

In D Compatibility with XPath 1.0 (Non-Normative) (first table, first table body, twenty-eighth row, first column):

Replace the text:

fn:translate( $arg  as xs:string?,
$mapString  as xs:string?,
$transString  as xs:string?) as xs:string

By:

fn:translate( $arg  as xs:string?,
$mapString  as xs:string,
$transString  as xs:string) as xs:string

FO.E8 - editorial

See Bug 4545

Description

A character code confuses decimal and hexadecimal notation.

History

8 May 2007: Proposed

25 Sep 2007: Corrected

Change

In 7.4.11 fn:iri-to-uri (first notes, second paragraph):

Replace the text:

The following printable ASCII characters are invalid in an IRI: "<", ">", " " " (double quote), space, "{", "}", "|", "\", "^", and "`". Since these characters should not appear in an IRI, if they do appear in $iri they will be percent-encoded. In addition, characters outside the range x20-x126 will be percent-encoded because they are invalid in a URI.

By:

The following printable ASCII characters are invalid in an IRI: "<", ">", " " " (double quote), space, "{", "}", "|", "\", "^", and "`". Since these characters should not appear in an IRI, if they do appear in $iri they will be percent-encoded. In addition, characters outside the range x20-x7E will be percent-encoded because they are invalid in a URI.
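The corrected range can be illustrated with a short Python sketch (informal, and not part of the erratum text). It assumes that escaping means percent-encoding the UTF-8 octets of a character with upper-case hexadecimal digits; the function name iri_to_uri and the constant INVALID_IN_IRI are hypothetical, and no check is made that the input is a well-formed IRI. The point of the correction is that printable ASCII ends at x7E (decimal 126), whereas x126 would be decimal 294.

```python
# Printable ASCII characters that the note lists as invalid in an IRI:
# < > " space { } | \ ^ `
INVALID_IN_IRI = set('<>" {}|\\^`')


def iri_to_uri(iri: str) -> str:
    """Percent-encode listed characters and anything outside x20-x7E."""
    out = []
    for ch in iri:
        if ch in INVALID_IN_IRI or not (0x20 <= ord(ch) <= 0x7E):
            # Encode each UTF-8 octet of the character as %HH.
            out.append("".join(f"%{octet:02X}" for octet in ch.encode("utf-8")))
        else:
            out.append(ch)
    return "".join(out)


print(iri_to_uri("http://www.example.com/~b\u00e9b\u00e9"))
# http://www.example.com/~b%C3%A9b%C3%A9
```

Characters that are already percent-encoded, such as the sequence %20, pass through unchanged because "%" lies inside the range and is not in the list of invalid characters.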

FO.E7 - substantive

See Bug 4543

Description

The meaning of the regex flag "m" is unclear when the last character in the string is a newline.

History

6 May 2007: Proposed

15 May 2007: Accepted

Change

In 7.6.1.1 Flags (first bulleted list, second item):

Replace the text:

m: If present, the match operates in multi-line mode. By default, the meta-character ^ matches the start of the entire string, while $ matches the end of the entire string. In multi-line mode, ^ matches the start of any line (that is, the start of the entire string, and the position immediately after a newline character), while $ matches the end of any line (that is, the end of the entire string, and the position immediately before a newline character). Newline here means the character #x0A only.

By:

m: If present, the match operates in multi-line mode. By default, the meta-character ^ matches the start of the entire string, while $ matches the end of the entire string. In multi-line mode, ^ matches the start of any line (that is, the start of the entire string, and the position immediately after a newline character other than a newline that appears as the last character in the string), while $ matches the end of any line ( that is, the position immediately before a newline character, and the end of the entire string if there is no newline character at the end of the string). Newline here means the character #x0A only.
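The revised wording can be read as a rule for which positions in the input the anchors ^ and $ may match. The following Python sketch (informal, not part of the erratum text) computes those positions directly from the rule as stated; the function names caret_positions and dollar_positions are hypothetical. It deliberately avoids Python's re module, whose own multi-line mode is defined by different rules.

```python
from typing import List

NEWLINE = "\n"  # the character #x0A only


def caret_positions(text: str) -> List[int]:
    """Offsets at which ^ matches in multi-line mode under the revised rule."""
    positions = [0]  # start of the entire string
    for index, ch in enumerate(text):
        # After a newline, unless that newline is the last character.
        if ch == NEWLINE and index != len(text) - 1:
            positions.append(index + 1)
    return positions


def dollar_positions(text: str) -> List[int]:
    """Offsets at which $ matches in multi-line mode under the revised rule."""
    # Immediately before every newline character.
    positions = [index for index, ch in enumerate(text) if ch == NEWLINE]
    # End of the string only if the string does not end with a newline.
    if not text.endswith(NEWLINE):
        positions.append(len(text))
    return positions


print(caret_positions("a\nb\n"))   # [0, 2]
print(dollar_positions("a\nb\n"))  # [1, 3]
```

For the input "a\nb\n" the trailing newline therefore does not introduce an empty third line: there are two line starts and two line ends.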

FO.E6 - substantive

See Bug 4471

Description

Casting from date and time type to string represents the UTC timezone as "+00:00" rather than as "Z". This erratum changes the representation to "Z".

History

1 May 2007: Proposed

8 May 2007: Amended

8 May 2007: Accepted

27 Jun 2007: Corrected

Change

In 17.1.5 Casting to date and time types (fourth code section):

Replace the text:

declare function eg:convertTZtoString($tz as xs:dayTimeDuration?) as xs:string
{
   if (empty($tz)) then ""
   else 
     let $tzh := fn:hours-from-dayTimeDuration($tz)
     let $tzm := fn:minutes-from-dayTimeDuration($tz)
     let $plusMinus := if ($tzh >= 0) then "+" else "-"
     let $tzhString := eg:convertTo2CharString(fn:abs($tzh))
     let $tzmString := eg:convertTo2CharString(fn:abs($tzm))
     return fn:concat($plusMinus, $tzhString, ":", $tzmString)
}

                    

By:

declare function eg:convertTZtoString($tz as xs:dayTimeDuration?) as xs:string
{
   if (empty($tz)) 
     then ""
   else if ($tz eq xs:dayTimeDuration('PT0S'))
     then "Z"
   else 
     let $tzh := fn:hours-from-duration($tz)
     let $tzm := fn:minutes-from-duration($tz)
     let $plusMinus := if ($tzh >= 0) then "+" else "-"
     let $tzhString := eg:convertTo2CharString(fn:abs($tzh))
     let $tzmString := eg:convertTo2CharString(fn:abs($tzm))
     return fn:concat($plusMinus, $tzhString, ":", $tzmString)
}
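For readers more familiar with other languages, the revised function can be approximated in Python (an informal sketch, not part of the erratum text). It assumes the timezone is supplied as a datetime.timedelta, or None for an absent timezone, and that the offset is a whole number of minutes; the name convert_tz_to_string is hypothetical. One deliberate difference is that the sign is taken from the whole offset rather than from its hours component, so an offset such as minus thirty minutes is rendered with a leading "-".

```python
from datetime import timedelta
from typing import Optional


def convert_tz_to_string(tz: Optional[timedelta]) -> str:
    """Render a timezone offset as "", "Z", or a signed hh:mm string."""
    if tz is None:
        return ""                      # no timezone
    total_seconds = int(tz.total_seconds())
    if total_seconds == 0:
        return "Z"                     # UTC is written as "Z", not "+00:00"
    sign = "+" if total_seconds > 0 else "-"
    hours, remainder = divmod(abs(total_seconds), 3600)
    minutes = remainder // 60
    return f"{sign}{hours:02d}:{minutes:02d}"


print(convert_tz_to_string(timedelta(0)))                    # Z
print(convert_tz_to_string(timedelta(hours=5, minutes=30)))  # +05:30
print(convert_tz_to_string(timedelta(hours=-8)))             # -08:00
```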

                    

FO.E5 - substantive

See Bug 4448

Description

The function signatures for the internal functions op:subtract-dates and op:subtract-dateTimes incorrectly allow an empty sequence as the return value.

History

17 Apr 2007: Proposed

1 May 2007: Corrected

Changes

  1. In 10.8.1 op:subtract-dateTimes (first function signature):

    Replace the text:

    op:subtract-dateTimes( $arg1  as xs:dateTime,
    $arg2  as xs:dateTime) as xs:dayTimeDuration?

    By:

    op:subtract-dateTimes( $arg1  as xs:dateTime,
    $arg2  as xs:dateTime) as xs:dayTimeDuration

  2. In 10.8.2 op:subtract-dates (first function signature):

    Replace the text:

    op:subtract-dates($arg1 as xs:date, $arg2 as xs:date) as xs:dayTimeDuration?

    By:

    op:subtract-dates($arg1 as xs:date, $arg2 as xs:date) as xs:dayTimeDuration

FO.E4 - substantive

See Bug 4106

See Bug 4634

Description

The regex specification allows a back-reference within square brackets, which is meaningless. Furthermore, the specification doesn't say what happens when a regular expression contains a back-reference to a non-existent subexpression.

History

27 Feb 2007: Proposed

27 Jun 2007: Amended

Change

In 7.6.1 Regular Expression Syntax (starting at first bulleted list, fourth item, first paragraph):

Replace the text:

Back-references are allowed. The construct \n where n is a single digit is always recognized as a back-reference; if this is followed by further digits, these digits are taken to be part of the back-reference if and only if the back-reference is preceded by sufficiently many capturing subexpressions. A back-reference matches the string that was matched by the nth capturing subexpression within the regular expression, that is, the parenthesized subexpression whose opening left parenthesis is the nth unescaped left parenthesis within the regular expression. The closing right parenthesis of this subexpression must occur before the back-reference. For example, the regular expression ('|").*\1 matches a sequence of characters delimited either by an apostrophe at the start and end, or by a quotation mark at the start and end.

If no string is matched by the nth capturing subexpression, the back-reference is interpreted as matching a zero-length string.

Back-references change the following production:

[23] charClassEsc ::= ( SingleCharEsc | MultiCharEsc | catEsc | complEsc )

to

[23] charClassEsc ::= ( SingleCharEsc | MultiCharEsc | catEsc | complEsc | backReference )

[23a] backReference ::= "\" [1-9][0-9]*

By:

Back-references are allowed outside a character class expression. A back-reference is an additional kind of atom. The construct \n where n is a single digit is always recognized as a back-reference; if this is followed by further digits, these digits are taken to be part of the back-reference if and only if the back-reference is preceded by sufficiently many capturing subexpressions. A back-reference matches the string that was matched by the nth capturing subexpression within the regular expression, that is, the parenthesized subexpression whose opening left parenthesis is the nth unescaped left parenthesis within the regular expression. The regular expression is invalid if this subexpression does not exist or if its closing right parenthesis occurs after the back-reference. For example, the regular expression ('|").*\1 matches a sequence of characters delimited either by an apostrophe at the start and end, or by a quotation mark at the start and end.

If no string is matched by the nth capturing subexpression, the back-reference is interpreted as matching a zero-length string.

Back-references change the following production:

[9] atom ::= Char | charClass | ( '(' regExp ')' )

to

[9] atom ::= Char | charClass | ( '(' regExp ')' ) | backReference

[9a] backReference ::= "\" [1-9][0-9]*

Note:

Within a character class expression, \ followed by a digit is invalid. Some other regular expression languages interpret this as an octal character reference.
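The revised rules lend themselves to a simple static check, sketched here in Python (informal, not part of the erratum text). The sketch makes several assumptions: every unescaped left parenthesis outside a character class opens a capturing subexpression; "preceded by sufficiently many capturing subexpressions" is read as counting the left parentheses seen so far; and nothing else about the regular expression is validated. The names check_back_references and is_valid are hypothetical.

```python
from typing import List

DIGITS = "0123456789"


def check_back_references(regex: str) -> List[int]:
    """Return the back-reference numbers used, or raise ValueError if one is invalid."""
    opened = 0             # capturing groups opened so far
    open_groups = []       # numbers of groups not yet closed
    closed = set()         # numbers of groups already closed
    class_depth = 0        # nesting depth of [...] (subtraction can nest)
    refs = []
    i = 0
    while i < len(regex):
        ch = regex[i]
        if ch == "\\":
            if i + 1 >= len(regex):
                raise ValueError("dangling backslash")
            nxt = regex[i + 1]
            if nxt in DIGITS:
                if class_depth > 0:
                    raise ValueError("backslash-digit inside a character class")
                if nxt == "0":
                    raise ValueError("back-reference must start with 1-9")
                number, j = int(nxt), i + 2
                # Absorb further digits only while enough groups precede.
                while (j < len(regex) and regex[j] in DIGITS
                       and number * 10 + int(regex[j]) <= opened):
                    number = number * 10 + int(regex[j])
                    j += 1
                if number not in closed:
                    raise ValueError(f"group {number} missing or not yet closed")
                refs.append(number)
                i = j
                continue
            i += 2         # any other escape: skip the escaped character
            continue
        if ch == "[":
            class_depth += 1
        elif ch == "]" and class_depth > 0:
            class_depth -= 1
        elif class_depth == 0:
            if ch == "(":
                opened += 1
                open_groups.append(opened)
            elif ch == ")" and open_groups:
                closed.add(open_groups.pop())
        i += 1
    return refs


def is_valid(regex: str) -> bool:
    """True if every back-reference in the expression passes the check."""
    try:
        check_back_references(regex)
        return True
    except ValueError:
        return False


print(check_back_references(r"""('|").*\1"""))  # [1]
print(is_valid(r"(a)\2"))                       # False
print(is_valid(r"(a)[\1]"))                     # False
```

In this reading, (a)\10 is a back-reference to group 1 followed by the literal digit 0, because only one capturing subexpression precedes it.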

FO.E3 - markup

See Bug 4385

Description

An example under fn:idref is incorrectly formatted.

History

5 Apr 2007: Proposed

Change

In 15.5.3 fn:idref (first numbered list, second item, first bulleted list, first item, first bulleted list, second item, first code section):

Replace the text:

fn:tokenize(fn:normalize-space($N),
                                                  ' ')

By:

fn:tokenize(fn:normalize-space($N), ' ')

FO.E2 - editorial

See Bug 4384

Description

The description of fn:subsequence contains a spurious variable $p.

History

3 Apr 2007: Proposed

Changes

  1. In 15.1.10 fn:subsequence (first code section):

    Replace the text:

    $sourceSeq[fn:round($startingLoc) le $p]

    By:

    $sourceSeq[fn:round($startingLoc) le position()]

  2. In 15.1.10 fn:subsequence (second code section):

    Replace the text:

    $sourceSeq[fn:round($startingLoc) le $p 
         and $p lt fn:round($startingLoc) + fn:round($length)]

    By:

    $sourceSeq[fn:round($startingLoc) le position() 
        and position() lt fn:round($startingLoc) + fn:round($length)]
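The two corrected predicates can be mirrored in Python to show how position() takes the place of the spurious $p (an informal sketch, not part of the erratum text). It assumes finite numeric arguments and models fn:round as floor(x + 0.5); the names xpath_round and subsequence are hypothetical. Positions are 1-based, as in XPath.

```python
import math
from typing import List, Optional, Sequence


def xpath_round(value: float) -> int:
    """Model fn:round for finite values: halves round toward +infinity."""
    return math.floor(value + 0.5)


def subsequence(source_seq: Sequence, starting_loc: float,
                length: Optional[float] = None) -> List:
    """Keep the items whose 1-based position satisfies the predicate."""
    start = xpath_round(starting_loc)
    if length is None:
        # $sourceSeq[fn:round($startingLoc) le position()]
        return [item for position, item in enumerate(source_seq, start=1)
                if start <= position]
    end = start + xpath_round(length)
    # $sourceSeq[fn:round($startingLoc) le position()
    #     and position() lt fn:round($startingLoc) + fn:round($length)]
    return [item for position, item in enumerate(source_seq, start=1)
            if start <= position < end]


items = ["a", "b", "c", "d", "e"]
print(subsequence(items, 4))     # ['d', 'e']
print(subsequence(items, 3, 2))  # ['c', 'd']
print(subsequence(items, 0, 2))  # ['a']
```

The last call shows why the predicate form matters: a starting location of 0 with a length of 2 covers positions 0 and 1, and only position 1 exists.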

FO.E1 - substantive

See Bug 4373

Description

In fn:resolve-uri it is unclear what happens when the supplied base URI is a relative reference.

History

3 Apr 2007: Proposed

Change

In 8.1 fn:resolve-uri (starting at first paragraph):

Replace the text:

Summary: The purpose of this function is to enable a relative URI to be resolved against an absolute URI.

The first form of this function resolves $relative against the value of the base-uri property from the static context. If the base-uri property is not initialized in the static context an error is raised [err:FONS0005].

If $relative is a relative URI reference, it is resolved against $base, or the base-uri property from the static context, using an algorithm such as the ones described in [RFC 2396] or [RFC 3986] , and the resulting absolute URI reference is returned. An error may be raised [err:FORG0009] in the resolution process.

If $relative is an absolute URI reference, it is returned unchanged.

If $relative or $base is not a valid xs:anyURI an error is raised [err:FORG0002].

If $relative is the empty sequence, the empty sequence is returned.

Note:

Resolving a URI does not dereference it. This is merely a syntactic operation on two character strings.

By:

Summary: This function enables a relative URI reference to be resolved against an absolute URI.

The first form of this function resolves $relative against the value of the base-uri property from the static context. If the base-uri property is not initialized in the static context an error is raised [err:FONS0005].

If $relative is a relative URI reference, it is resolved against $base, or against the base-uri property from the static context, using an algorithm such as those described in [RFC 2396] or [RFC 3986] , and the resulting absolute URI reference is returned.

If $relative is an absolute URI reference, it is returned unchanged.

If $relative is the empty sequence, the empty sequence is returned.

If $relative is not a valid URI according to the rules of the xs:anyURI data type, or if it is not a suitable relative reference to use as input to the chosen resolution algorithm, then an error is raised [err:FORG0002].

If $base is not a valid URI according to the rules of the xs:anyURI data type, if it is not a suitable URI to use as input to the chosen resolution algorithm (for example, if it is a relative URI reference, if it is a non-hierarchic URI, or if it contains a fragment identifier), then an error is raised [err:FORG0002].

If the chosen resolution algorithm fails for any other reason then an error is raised [err:FORG0009].

Note:

Resolving a URI does not dereference it. This is merely a syntactic operation on two character strings.

Note:

The algorithms in the cited RFCs include some variations that are optional or recommended rather than mandatory; they also describe some common practices that are not recommended, but which are permitted for backwards compatibility. Where the cited RFCs permit variations in behavior, so does this specification.
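The order of the checks in the revised text can be sketched in Python using the standard urllib.parse module as the chosen resolution algorithm (informal, not part of the erratum text). Several assumptions apply: only the two-argument form is modeled; None stands for the empty sequence; validity against the xs:anyURI data type is not checked; a base is treated as hierarchic if it has an authority or a path beginning with "/"; and urljoin resolves only against schemes that Python knows to be hierarchical. The XPathError class and the function name resolve_uri are hypothetical.

```python
from typing import Optional
from urllib.parse import urljoin, urlsplit


class XPathError(Exception):
    """Carries an error code such as FORG0002 (hypothetical helper class)."""

    def __init__(self, code: str, message: str) -> None:
        super().__init__(f"{code}: {message}")
        self.code = code


def resolve_uri(relative: Optional[str], base: str) -> Optional[str]:
    """Resolve a relative URI reference against an absolute base URI."""
    if relative is None:
        return None                        # empty sequence in, empty sequence out
    if urlsplit(relative).scheme:
        return relative                    # absolute reference: returned unchanged
    parts = urlsplit(base)
    if not parts.scheme:
        raise XPathError("FORG0002", "base is a relative URI reference")
    if parts.fragment:
        raise XPathError("FORG0002", "base contains a fragment identifier")
    if not (parts.netloc or parts.path.startswith("/")):
        raise XPathError("FORG0002", "base is a non-hierarchic URI")
    # Purely syntactic: nothing is dereferenced.
    return urljoin(base, relative)


print(resolve_uri("c", "http://example.com/a/b"))     # http://example.com/a/c
print(resolve_uri("../c", "http://example.com/a/b"))  # http://example.com/c
```

A base such as "a/b" or "mailto:someone@example.com" is rejected with FORG0002 in this sketch, matching the examples of unsuitable base URIs given in the revised text.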


Index by affected section

7.4.5 fn:normalize-space

FO.E14

7.4.11 fn:iri-to-uri

FO.E8

7.6.1 Regular Expression Syntax

FO.E4

7.6.1.1 Flags

FO.E7

8.1 fn:resolve-uri

FO.E1

10.6.3 op:multiply-yearMonthDuration

FO.E12

10.6.4 op:divide-yearMonthDuration

FO.E12

10.8.1 op:subtract-dateTimes

FO.E5

10.8.2 op:subtract-dates

FO.E5

14.3 fn:namespace-uri

FO.E15

14.5 fn:lang

FO.E16

15.1.10 fn:subsequence

FO.E2

15.5.2 fn:id

FO.E13

15.5.3 fn:idref

FO.E3 FO.E13

17.1.1 Casting from xs:string and xs:untypedAtomic

FO.E11

17.1.2 Casting to xs:string and xs:untypedAtomic

FO.E10

17.1.5 Casting to date and time types

FO.E6

D Compatibility with XPath 1.0 (Non-Normative)

FO.E9

Index by Bugzilla entry

Bug #4106: FO.E4

Bug #4373: FO.E1

Bug #4384: FO.E2

Bug #4385: FO.E3

Bug #4448: FO.E5

Bug #4471: FO.E6

Bug #4519: FO.E13

Bug #4543: FO.E7

Bug #4545: FO.E8

Bug #4549: FO.E9

Bug #4621: FO.E12

Bug #4634: FO.E4

Bug #4874: FO.E10 FO.E11

Bug #4974: FO.E14

Bug #5235: FO.E15

Bug #5246: FO.E16

Index by function

fn:id: FO.E13

fn:idref: FO.E3 FO.E13

fn:iri-to-uri: FO.E8

fn:lang: FO.E16

fn:matches: FO.E4 FO.E7

fn:namespace-uri: FO.E15

fn:normalize-space: FO.E14

fn:replace: FO.E4 FO.E7

fn:resolve-uri: FO.E1

fn:string: FO.E6

fn:subsequence: FO.E2

fn:tokenize: FO.E4 FO.E7

fn:translate: FO.E9

op:divide-yearMonthDuration: FO.E12

op:multiply-yearMonthDuration: FO.E12

op:subtract-dateTimes: FO.E5

op:subtract-dates: FO.E5

op:subtract-times: FO.E5

Index by error-code

FONS0005: FO.E1

FORG0002: FO.E1

FORG0009: FO.E1

XPDY0002: FO.E14

XPTY0004: FO.E11

Index by type

xs:ID: FO.E13

xs:IDREF: FO.E13

xs:IDREFS: FO.E13

xs:NOTATION: FO.E10 FO.E11

xs:QName: FO.E11

xs:date: FO.E6

xs:dateTime: FO.E6

xs:gDay: FO.E6

xs:gMonth: FO.E6

xs:gMonthDay: FO.E6

xs:gYear: FO.E6

xs:gYearMonth: FO.E6

xs:string: FO.E6

xs:time: FO.E6

xs:yearMonthDuration: FO.E12