W3C

XQuery 1.0 and XPath 2.0 Functions and Operators

W3C Working Draft 30 April 2002

This version:
http://www.w3.org/TR/2002/WD-xquery-operators-20020430/
Latest version:
http://www.w3.org/TR/xquery-operators/
Previous version:
http://www.w3.org/TR/2001/WD-xquery-operators-20011220/
Editors:
Ashok Malhotra (XML Query and XSL WGs), Microsoft <ashokma@microsoft.com>
Jim Melton (XML Query WG), Oracle Corp <jim.melton@acm.org>
Jonathan Robie (XML Query WG), Software AG <Jonathan.Robie@SoftwareAG-USA.com>
Norman Walsh (XSL WG), Sun Microsystems <Norman.Walsh@Sun.COM>

Abstract

This document defines basic operators and functions on the datatypes defined in [XML Schema Part 2: Datatypes] for use in XQuery, XPath, XSLT and other related XML standards. It also discusses operators and functions on nodes and node sequences as defined in the [XQuery 1.0 and XPath 2.0 Data Model] for use in XQuery, XPath, XSLT and other related XML standards.

Status of this Document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. The latest status of this document series is maintained at the W3C.

This is a Public Working Draft of this document for review by W3C Members and other interested parties. It is a draft document and may be updated, replaced or made obsolete by other documents at any time. It is inappropriate to use W3C Working Drafts as reference material or to cite them as other than "work in progress". This is work in progress and does not imply endorsement by the W3C membership.

This document describes constructors, operators, and functions that are used in [XPath 2.0], [XQuery 1.0: An XML Query Language] and [XSLT 2.0]. The document is generally unconcerned with the specific syntax with which these constructors, operators, and functions will be used, and focuses instead on defining their semantics as precisely as feasible.

Perhaps the largest change from previous versions of this document appears in section 6 Constructors, Functions, and Operators on Durations, Dates and Times. The duration datatype defined in [XML Schema Part 2: Datatypes] is partially ordered, which makes it difficult to work with. In light of this, the Working Groups decided to limit support for the duration datatype and define two new, totally ordered, subtypes of duration called yearMonthDuration and dayTimeDuration. These datatypes are defined in section 6.2 Two Totally Ordered Subtypes of Duration and a full complement of functions is defined on them in the following sections.

Another significant change that the Working Groups have been discussing but have not finalized is the treatment of the [XML Schema Part 2: Datatypes] date and time types: dateTime, date and time as well as gYear, gMonth, gDay, gYearMonth and gMonthDay. These types also have partial orders because they may or may not contain time zones. (See issue 150 below.) A solution to this problem is to postulate that dateTime, date and time values that do not have a time zone are, for purposes of comparison, given an implementation defined time zone. If we adopt this solution we can define constructors, equality and comparison functions and casting on these datatypes as well as helper functions to add and remove time zones, etc. For the other five datatypes (gYear, gMonth, gDay, gYearMonth and gMonthDay) we are considering adopting a similar solution but limiting the support to only constructors, equality and, possibly, casting.

There is also ongoing discussion about whether the syntax for constructors and casting can be unified. (See issue 16 below.) This has potentially wide impact and may reopen the closed issue of casting to and from derived types. For this reason section 14 Casting Functions is unchanged from the previous version of this document.

This document has been produced following the procedures set out for the W3C Process. This document was produced through the efforts of a joint task force of the W3C XML Query Working Group and the W3C XML Schema Working Group (both part of the W3C XML Activity) and a second joint task force of the W3C XML Query Working Group and the W3C XSL Working Group (part of the W3C Style Activity). It is designed to be read in conjunction with the following documents: [XQuery 1.0 and XPath 2.0 Data Model], [XPath 2.0], [XQuery 1.0: An XML Query Language] and [XSLT 2.0].

The following are identified as high priority issues. Reviewers are requested to provide feedback on these issues using the address below.

[Issue 16: Is a constructor more than a different syntax for CAST?]

[Issue 139: Need a fuller treatment of error behaviour and possibly error handling.]

[Issue 145: Need decisions and text in several of our documents detailing conformance requirements based on resource limitations.]

[Issue 150: Should we support comparisons of date/time types that return indeterminate results?]

A list of current W3C Recommendations and other technical documents can be found at http://www.w3.org/TR/.

Comments on this document should be sent to the W3C mailing list www-xml-query-comments@w3.org (archived at http://lists.w3.org/Archives/Public/www-xml-query-comments/).

Table of Contents

1 Introduction
    1.1 Terminology
    1.2 Datatypes
    1.3 Syntax
    1.4 Notations
    1.5 Namespace Prefix
2 Accessors
    2.1 xf:node-kind
    2.2 xf:node-name
    2.3 xf:string
    2.4 xf:data
    2.5 xf:base-uri
    2.6 xf:unique-ID
3 Constructors, Functions, and Operators on Numbers
    3.1 Numeric Types
    3.2 Numeric Constructors
        3.2.1 xf:decimal
        3.2.2 xf:integer
        3.2.3 xf:long
        3.2.4 xf:int
        3.2.5 xf:short
        3.2.6 xf:byte
        3.2.7 xf:float
        3.2.8 xf:double
    3.3 Operators on Numeric Values
        3.3.1 op:numeric-add
        3.3.2 op:numeric-subtract
        3.3.3 op:numeric-multiply
        3.3.4 op:numeric-divide
        3.3.5 op:numeric-mod
        3.3.6 op:numeric-unary-plus
        3.3.7 op:numeric-unary-minus
    3.4 Comparisons of Numeric Values
        3.4.1 op:numeric-equal
        3.4.2 op:numeric-less-than
        3.4.3 op:numeric-greater-than
    3.5 Functions on Numeric Values
        3.5.1 xf:floor
        3.5.2 xf:ceiling
        3.5.3 xf:round
4 Constructors and Functions on Strings
    4.1 String Types
    4.2 String Constructors
        4.2.1 xf:string
        4.2.2 xf:normalizedString
        4.2.3 xf:token
        4.2.4 xf:language
        4.2.5 xf:Name
        4.2.6 xf:NMTOKEN
        4.2.7 xf:NCName
        4.2.8 xf:ID
        4.2.9 xf:IDREF
        4.2.10 xf:ENTITY
    4.3 Equality and Comparison of Strings
        4.3.1 xf:compare
    4.4 Functions on String Values
        4.4.1 Usage Notes
        4.4.2 xf:concat
        4.4.3 xf:starts-with
        4.4.4 xf:ends-with
        4.4.5 xf:contains
        4.4.6 xf:substring
        4.4.7 xf:string-length
        4.4.8 xf:substring-before
        4.4.9 xf:substring-after
        4.4.10 xf:normalize-space
        4.4.11 xf:normalize-unicode
        4.4.12 xf:upper-case
        4.4.13 xf:lower-case
        4.4.14 xf:translate
        4.4.15 xf:string-pad
        4.4.16 xf:match
        4.4.17 xf:replace
5 Constructors, Functions and Operators on Booleans
    5.1 Boolean Constructors
        5.1.1 xf:true
        5.1.2 xf:false
        5.1.3 xf:boolean-from-string
    5.2 Operators on Boolean Values
        5.2.1 op:boolean-equal
    5.3 Functions on Boolean Values
        5.3.1 xf:not
        5.3.2 op:boolean-less-than
        5.3.3 op:boolean-greater-than
6 Constructors, Functions, and Operators on Durations, Dates and Times
    6.1 Duration, Date and Time Types
    6.2 Two Totally Ordered Subtypes of Duration
        6.2.1 yearMonthDuration
        6.2.2 dayTimeDuration
    6.3 Duration, Date and Time Constructors
        6.3.1 xf:duration
        6.3.2 xf:dateTime
        6.3.3 xf:date
        6.3.4 xf:time
        6.3.5 xf:gYearMonth
        6.3.6 xf:gYear
        6.3.7 xf:gMonthDay
        6.3.8 xf:gMonth
        6.3.9 xf:gDay
    6.4 yearMonthDuration and dayTimeDuration Constructors
        6.4.1 xf:yearMonthDuration
        6.4.2 xf:yearMonthDuration-from-months
        6.4.3 xf:dayTimeDuration
        6.4.4 xf:dayTimeDuration-from-seconds
    6.5 Comparisons of Duration, Date and Time Values
        6.5.1 op:duration-equal
        6.5.2 op:yearMonthDuration-equal
        6.5.3 op:yearMonthDuration-less-than
        6.5.4 op:yearMonthDuration-greater-than
        6.5.5 op:dayTimeDuration-equal
        6.5.6 op:dayTimeDuration-less-than
        6.5.7 op:dayTimeDuration-greater-than
        6.5.8 op:datetime-equal
        6.5.9 op:datetime-less-than
        6.5.10 op:datetime-greater-than
    6.6 Component Extraction Functions on Duration, Date and Time Values
        6.6.1 xf:get-years-from-yearMonthDuration
        6.6.2 xf:get-months-from-yearMonthDuration
        6.6.3 xf:get-days-from-dayTimeDuration
        6.6.4 xf:get-hours-from-dayTimeDuration
        6.6.5 xf:get-minutes-from-dayTimeDuration
        6.6.6 xf:get-seconds-from-dayTimeDuration
        6.6.7 xf:get-year-from-dateTime
        6.6.8 xf:get-year-from-date
        6.6.9 xf:get-month-from-dateTime
        6.6.10 xf:get-month-from-date
        6.6.11 xf:get-day-from-dateTime
        6.6.12 xf:get-day-from-date
        6.6.13 xf:get-hours-from-dateTime
        6.6.14 xf:get-hours-from-time
        6.6.15 xf:get-minutes-from-dateTime
        6.6.16 xf:get-minutes-from-time
        6.6.17 xf:get-seconds-from-dateTime
        6.6.18 xf:get-seconds-from-time
        6.6.19 xf:get-timezone-from-dateTime
        6.6.20 xf:get-timezone-from-date
        6.6.21 xf:get-timezone-from-time
    6.7 Arithmetic Functions on yearMonthDuration and dayTimeDuration
        6.7.1 op:add-yearMonthDurations
        6.7.2 op:subtract-yearMonthDurations
        6.7.3 op:multiply-yearMonthDuration
        6.7.4 op:divide-yearMonthDuration
        6.7.5 op:add-dayTimeDurations
        6.7.6 op:subtract-dayTimeDurations
        6.7.7 op:multiply-dayTimeDuration
        6.7.8 op:divide-dayTimeDuration
    6.8 Arithmetic Functions on Dates
        6.8.1 xf:add-days
    6.9 Functions and Operators on TimePeriod Values
        6.9.1 xf:get-yearMonthDuration
        6.9.2 xf:get-dayTimeDuration
        6.9.3 op:add-yearMonthDuration-to-dateTime
        6.9.4 op:add-dayTimeDuration-to-dateTime
        6.9.5 op:subtract-yearMonthDuration-from-dateTime
        6.9.6 op:subtract-dayTimeDuration-from-dateTime
        6.9.7 op:add-yearMonthDuration-to-date
        6.9.8 op:add-dayTimeDuration-to-date
        6.9.9 op:subtract-yearMonthDuration-from-date
        6.9.10 op:subtract-dayTimeDuration-from-date
        6.9.11 op:add-dayTimeDuration-to-time
        6.9.12 op:subtract-dayTimeDuration-from-time
7 Constructors and Functions on QNames
    7.1 Constructors for QNames
        7.1.1 xf:QName-from-uri
        7.1.2 xf:QName-from-string
        7.1.3 xf:QName
    7.2 Functions on QNames
        7.2.1 op:QName-equal
        7.2.2 xf:get-local-name
        7.2.3 xf:get-namespace-uri
        7.2.4 xf:namespace-uri
8 Constructors, Functions, and Operators for anyURI
    8.1 Constructors for anyURI
        8.1.1 xf:anyURI
        8.1.2 xf:resolve-URI
    8.2 Functions on anyURI
        8.2.1 op:anyURI-equal
9 Functions and Operators on base64Binary and hexBinary
    9.1 Comparisons of base64Binary and hexBinary Values
        9.1.1 op:hex-binary-equal
        9.1.2 op:base64-binary-equal
10 Constructors, Functions, and Operators on NOTATION
    10.1 NOTATION Constructor
        10.1.1 xf:NOTATION
    10.2 Functions on NOTATION
        10.2.1 op:NOTATION-equal
11 Functions and Operators on Nodes
    11.1 Functions and Operators on Nodes
        11.1.1 xf:name
        11.1.2 xf:local-name
        11.1.3 xf:number
        11.1.4 xf:lang
        11.1.5 op:node-equal
        11.1.6 xf:deep-equal
        11.1.7 op:node-before
        11.1.8 op:node-after
        11.1.9 op:node-precedes
        11.1.10 op:node-follows
        11.1.11 xf:copy
        11.1.12 xf:shallow
        11.1.13 xf:root
    11.2 if-absent() and if-empty()
        11.2.1 xf:if-absent
        11.2.2 xf:if-empty
12 Constructors, Functions, and Operators on Sequences
    12.1 Constructors on Sequences
        12.1.1 op:to
    12.2 Functions and Operators on Sequences
        12.2.1 xf:boolean
        12.2.2 op:concatenate
        12.2.3 op:item-at
        12.2.4 xf:index-of
        12.2.5 xf:empty
        12.2.6 xf:exists
        12.2.7 xf:distinct-nodes
        12.2.8 xf:distinct-values
        12.2.9 xf:insert
        12.2.10 xf:remove
        12.2.11 xf:subsequence
    12.3 Equals, Union, Intersection and Except
        12.3.1 xf:sequence-deep-equal
        12.3.2 xf:sequence-node-equal
        12.3.3 op:union
        12.3.4 op:intersect
        12.3.5 op:except
    12.4 Aggregate Functions
        12.4.1 xf:count
        12.4.2 xf:avg
        12.4.3 xf:max
        12.4.4 xf:min
        12.4.5 xf:sum
    12.5 Functions that Generate Sequences
        12.5.1 xf:id
        12.5.2 xf:idref
        12.5.3 xf:filter
        12.5.4 xf:document
        12.5.5 xf:collection
        12.5.6 xf:input
13 Context Functions
    13.1 xf:context-item
    13.2 xf:position
    13.3 xf:last
    13.4 op:context-document
    13.5 xf:current-dateTime
        13.5.1 Examples
14 Casting Functions
    14.1 Casting to primitive types from primitive types
    14.2 Casting from derived types to primitive types
    14.3 Casting to string
    14.4 Casting to numeric types
    14.5 Casting to duration and date and time types
    14.6 Casting to boolean
    14.7 Casting to base64Binary, hexBinary
    14.8 Casting to anyURI, QName and NOTATION

Appendices

A References
    A.1 Normative
    A.2 Non-normative
B Functions and Operators Issues List (Non-Normative)
C ChangeLog (Non-Normative)
D Function and Operator Quick Reference (Non-Normative)
    D.1 Functions and Operators by Section
    D.2 Functions and Operators Alphabetically


1 Introduction

[XML Schema Part 2: Datatypes] defines a number of primitive and derived datatypes, collectively known as built-in datatypes. This document defines operations on those datatypes for use in XQuery, XPath, XSLT and related XML standards. This document also discusses operators and functions on nodes and node sequences as defined in the [XQuery 1.0 and XPath 2.0 Data Model] for use in XQuery, XPath, XSLT and other related XML standards.

1.1 Terminology

The terminology used to describe the functions and operators on [XML Schema Part 2: Datatypes] is defined in the body of this specification. The terms defined in the following list are used in building those definitions:

[Definition] for compatibility

A feature of this specification included to ensure that implementations that use this feature remain compatible with [XPath 1.0]

[Definition] may

Conforming documents and processors are permitted to but need not behave as described.

[Definition] must

Conforming documents and processors are required to behave as described; otherwise they are non-conformant or in error.

[Definition] implementation defined

Possibly differing between implementations, but specified by the implementor for each particular implementation.

[Definition] implementation dependent

Possibly differing between implementations, but not specified by this or other W3C specification, and not required to be specified by the implementor for any particular implementation.

1.2 Datatypes

The diagram below shows the built-in [XML Schema Part 2: Datatypes]. Solid lines connect a base datatype above to a derived datatype below. Dashed lines connect an item type above to a datatype below that is created as a list of that item type.

[Figure: Type hierarchy graphic]

Diagram courtesy Asir Vedamuthu, webMethods

1.3 Syntax

The purpose of this document is to catalog the functions and operators required for XPath 2.0, XML Query 1.0 and XSLT 2.0. The exact syntax used to invoke these functions and operators is specified in [XPath 2.0], [XQuery 1.0: An XML Query Language] and [XSLT 2.0].

In general, the above specifications do not support function overloading. Consequently, there are no overloaded functions in this document except for legacy [XPath 1.0] functions such as string() which takes a single argument of a variety of types and concat() which takes a variable number of string arguments. This does not apply to operators such as "+" which may be overloaded. Functions with optional arguments are allowed. If optional arguments are omitted, omissions are assumed to begin from the right.

1.4 Notations

This document defines, among other things, constructors and other functions that apply to one or more data types. Each constructor and function is defined by specifying its signature, a description of each of its arguments, and its semantics. For many constructors and functions, examples are included to illustrate their use.

Each function's signature is presented in a form like this:

xf:function-name(parameter-type $parameter-name, ...) => return-type

In this notation, function-name is the name of the function whose signature is being specified. If the function takes no parameters, then the name is followed by an empty set of parentheses: (); otherwise, the name is followed by a parenthesized list of parameter declarations, each declaration specifying the static type of the parameter and a non-normative name used to reference the parameter when the function's semantics are specified. If there are two or more parameter declarations, they are separated by a comma. The return-type specifies the static type of the value returned by the function.

The function name is a QName as defined in [XML 1.0 Recommendation (Second Edition)] and must adhere to its syntactic conventions. Following [XPath 1.0], function names are composed of English words separated by hyphens ("-"). If a function name contains a [XML Schema Part 2: Datatypes] datatype name, the datatype name may have intercapitalized spelling and is used in the function name as such, for example xf:get-timezone-from-dateTime.

As is customary, the parameter type name indicates that the function accepts arguments of that type in that position. If the parameter type name is one of the simple types defined in [XML Schema Part 2: Datatypes] the function also accepts arguments with types derived from that type. These may be one of the derived types in [XML Schema Part 2: Datatypes] or they may be user-derived types.

Some functions accept the empty sequence as an argument and some may return the empty sequence. This is indicated in the function signature by following the parameter type name with a question mark:

xf:function-name(parameter-type? $parameter-name) => return-type?

The names of constructor functions have been chosen so that their local names are "spelled the same" as the local names of the types for which they are constructors. For example, the name of the constructor function that constructs values whose type is xsd:decimal is xf:decimal. Throughout this document, we typically omit the prefix xsd: in the names of XML Schema types.

[Issue 133: Syntax for indicating that function accepts empty sequence is incorrect]

1.5 Namespace Prefix

The functions and operators discussed in this document are contained in two namespaces and referenced using a qualified name. The namespace prefixes used in this document, merely for illustrative purposes, are xf: for the user functions and op: for the operator functions. The namespace prefix for these functions can vary, as long as the prefix is bound to the correct URI.

The actual namespaces (that is, the URI of the namespaces) are:

  • http://www.w3.org/2002/04/xquery-operators for operators

  • http://www.w3.org/2002/04/xquery-functions for functions.

The functions defined with the xf: prefix are callable by the user. Functions defined with the op: prefix are described here to underpin the definitions of the operators in [XPath 2.0], [XQuery 1.0: An XML Query Language] and [XSLT 2.0]. These functions are not available directly to users, and there is no requirement that implementations actually provide them. For example, multiplication is generally associated with the * operator, but it is described as a function in this document:

op:numeric-multiply(numeric $operand1, numeric $operand2) => numeric

2 Accessors

The [XQuery 1.0 and XPath 2.0 Data Model] describes accessors on different types of nodes and defines their semantics. Some of these accessors are exposed to the user through the functions described below.

Function       Accessor       Accepts                                              Returns
xf:node-kind   node-kind      any kind of node                                     string
xf:node-name   name           any kind of node                                     zero or one QName
xf:string      string-value   a sequence, a node of any kind, or a simple value    string
xf:data        typed-value    any kind of node                                     a typed sequence of atomic values
xf:base-uri    base-uri       Element or Document node                             zero or one anyURI
xf:unique-ID   unique-ID      Element node                                         zero or one ID

2.1 xf:node-kind

xf:node-kind(node $srcval) => string

This function returns a string value representing the node's kind: either "document", "element", "attribute", "text", "namespace", "processing-instruction", or "comment".
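
As an informal illustration only (Python and its DOM library are not part of this specification), the mapping from node to kind string might be sketched as follows. The name node_kind and the use of DOM node-type constants are assumptions of the sketch; the DOM has no namespace nodes, so the "namespace" kind is not covered.

```python
from xml.dom import Node
from xml.dom.minidom import parseString

# Hypothetical mapping from DOM node-type constants to the strings
# listed for xf:node-kind. The DOM has no namespace node, so the
# "namespace" kind is absent from this sketch.
_KINDS = {
    Node.DOCUMENT_NODE: "document",
    Node.ELEMENT_NODE: "element",
    Node.ATTRIBUTE_NODE: "attribute",
    Node.TEXT_NODE: "text",
    Node.PROCESSING_INSTRUCTION_NODE: "processing-instruction",
    Node.COMMENT_NODE: "comment",
}


def node_kind(node):
    """Return the kind of a DOM node as a string (sketch of xf:node-kind)."""
    return _KINDS[node.nodeType]


# A small document containing several node kinds.
doc = parseString('<a b="1">text<!--c--><?pi data?></a>')
root = doc.documentElement
kinds_of_children = [node_kind(child) for child in root.childNodes]
```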

2.2 xf:node-name

xf:node-name(node $srcval) => QName?

This function returns an expanded QName for node kinds that can have names. For other node kinds, it returns the empty sequence. Expanded QName is defined in [XQuery 1.0 and XPath 2.0 Data Model], and consists of a namespace URI and a local name.
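
A minimal sketch of this behaviour, again using Python's DOM library purely for illustration: an expanded QName is modelled as a (namespace URI, local name) pair, and None stands in for the empty sequence. The name node_name is hypothetical, and only element, attribute and processing-instruction nodes are treated as named here.

```python
from xml.dom import Node
from xml.dom.minidom import parseString


def node_name(node):
    """Sketch of xf:node-name.

    Returns a (namespace URI, local name) pair for nodes that have a
    name, and None (standing in for the empty sequence) otherwise.
    """
    if node.nodeType in (Node.ELEMENT_NODE, Node.ATTRIBUTE_NODE):
        return (node.namespaceURI, node.localName)
    if node.nodeType == Node.PROCESSING_INSTRUCTION_NODE:
        # A processing instruction is named by its target and has no namespace.
        return (None, node.target)
    return None


doc = parseString('<p:item xmlns:p="http://example.org/ns">hi</p:item>')
root = doc.documentElement
```

Note that the prefix p does not appear in the result: the pair carries only the namespace URI and the local name.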

2.3 xf:string

xf:string() => string
xf:string(item* $srcval) => string

Returns the value of $srcval represented as a string. If no argument is supplied, $srcval defaults to the context item (.).

If $srcval is the empty sequence, the empty string is returned.

If $srcval is a node, the function returns the string value of the node, as obtained using the string-value accessor defined in the [XQuery 1.0 and XPath 2.0 Data Model].

If $srcval is a simple value, the function returns the same string as is returned by the expression cast as xs:string ($srcval), except in the cases listed below:

  • If the type of $srcval is xs:decimal, and the value is equal to an integer, then the function returns the canonical representation of that integer.

  • If the type of $srcval is xs:anyURI, the URI is converted to a string without any escaping of special characters.

NOTE: The reason for the special rule for xs:decimal is backwards compatibility with XPath 1.0. Many simple arithmetic operations such as 10 div 5 will give a decimal result. In XPath 1.0, such operations gave a result of type "number". Converting such a number to a string would display it without a decimal point if the actual value was equal to an integer.

NOTE: The reason for the special rule for xs:anyURI is that, although XML Schema strongly discourages the use of spaces within URI values, the escaping of spaces can cause problems with legacy applications (for example, this applies to spaces within fragment identifiers in many HTML browsers), and should therefore be under user control.

NOTE: The string representation of double values is not backwards-compatible with the representation of number values in XPath 1.0. Ordinary double values are now represented using scientific notation; the representations of positive and negative infinity are now INF and -INF rather than Infinity and -Infinity; and negative zero is represented as -0 rather than 0. However, most expressions that would have produced a number in XPath 1.0 will produce a decimal (or integer) in XPath 2.0, so unless there is a loss of precision caused by numeric approximation, the result of the expression will in most simple cases be the same after conversion to a string.

NOTE: If a sequence of more than one item is supplied as the argument to the function, a type exception occurs. If fallback conversion rules are in use, this means that the function is applied in effect to the first item in the supplied sequence, and remaining items are ignored.
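
The special rule for xs:decimal can be illustrated with a short Python sketch. It assumes that Python's Decimal type is an adequate stand-in for xs:decimal; the function name decimal_to_string is hypothetical, and the sketch does not attempt to reproduce the general cast to xs:string for non-integral values.

```python
from decimal import Decimal


def decimal_to_string(value: Decimal) -> str:
    """Sketch of the xs:decimal special case described for xf:string.

    A decimal equal to an integer is written as that integer, with no
    decimal point; other values are written in fixed-point notation.
    Decimal's NaN and infinities have no xs:decimal counterpart and are
    not handled.
    """
    if value == value.to_integral_value():
        return str(int(value))
    # The "f" format avoids the exponent form Decimal may otherwise use.
    return format(value, "f")


# 10 div 5 yields a decimal result that is displayed without a decimal point.
quotient = decimal_to_string(Decimal("10") / Decimal("5"))
```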

[Issue 160: Align the string() function with 'cast as string'.]

2.4 xf:data

xf:data(node* $srcval) => value*

Returns the typed-value of each node in $srcval. Each node in $srcval is processed as follows.

The static type of the result for each node is determined by the static type of the value that is extracted.

If $srcval is not a node, returns the error value as defined in [XQuery 1.0 and XPath 2.0 Data Model].

If $srcval is a document, namespace, comment or processing instruction node, returns the error value as defined in [XQuery 1.0 and XPath 2.0 Data Model].

If $srcval is a text node, returns the string content of the text node with type annotation xs:anySimpleType.

If $srcval is an attribute node defined to have xs:anySimpleType, or an element node with complex type and simple content defined to have xs:anySimpleType, returns its string value with type annotation xs:anySimpleType.

If $srcval is an element node defined to have xs:anyType, returns its string value with type xs:anySimpleType as obtained by concatenating the descendant text nodes in document order.

If $srcval is an element or attribute node with a simple type other than xs:anySimpleType or with a complex type with simple content other than xs:anySimpleType, returns the node's typed value which is a sequence of atomic values. For example:

  • N is an element node of type hatsizelist, which is a complex type that includes a country attribute. The content of the type hatsizelist is a sequence of items of type hatsize, which is derived from xs:decimal. In XML Schema, this content is considered to have a simple type. The typed value of N is a sequence of values of type hatsize.

  • A is an attribute of type IDREFS, a list type derived from IDREF, and its content is "bar baz faz". The typed value of A is a sequence of three atomic values of type IDREF.

If $srcval is an element node defined to have a complex type other than xs:anyType, returns the error value as defined in [XQuery 1.0 and XPath 2.0 Data Model].

[Issue 164: For a complex type other than anyType, should data() return the string value obtained by concatenating descendant text nodes?]
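
The two list-typed examples above (the IDREFS attribute and the hatsizelist element) can be sketched in Python as follows. This is an illustration under stated assumptions, not a definition: typed_value_of_list and its item_type parameter are hypothetical names, plain str stands in for IDREF, and Decimal stands in for the decimal-derived hatsize type.

```python
from decimal import Decimal


def typed_value_of_list(content: str, item_type=str):
    """Sketch: typed value of a node whose content has a list type.

    The content is split on whitespace and each token is converted with
    item_type, a hypothetical stand-in for the item type's constructor.
    The result is a Python list modelling a sequence of atomic values.
    """
    return [item_type(token) for token in content.split()]


# Attribute A of type IDREFS with content "bar baz faz".
idrefs = typed_value_of_list("bar baz faz")

# Element N of type hatsizelist; the sizes used here are made-up sample data.
hat_sizes = typed_value_of_list("7.25 7.5", Decimal)
```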

2.5 xf:base-uri

xf:base-uri(node $srcval) => anyURI?

For document and element nodes this function returns the value of the base-uri property. For other kinds of node it returns the empty sequence.

2.6 xf:unique-ID

xf:unique-ID(node $srcval) => ID?

This function accepts an element node and returns the identifier (ID) which may have been assigned by the user. It corresponds to the normalized value property of the attribute information item in the attributes property that has a type ID, if one exists. If no ID attribute exists the empty sequence is returned.

3 Constructors, Functions, and Operators on Numbers

This section discusses arithmetic operators on the numeric datatypes defined in [XML Schema Part 2: Datatypes]. It uses an approach that permits lightweight operations whenever possible.

3.1 Numeric Types

The operators described in this section are defined on the following numeric types.

decimal
integer
int
long
short
byte
float
double

They also apply to user-defined types derived by restriction from these types.

3.2 Numeric Constructors

The following constructors are defined on the above numeric types. Each constructor takes a single string literal as argument. Leading and trailing whitespace, if present, is stripped from the literal before the value is constructed.

[Issue 149: Either the constructor functions should allow dynamic expressions or the syntax should be changed so that they do not appear to be functions. ]

Constructor   Meaning
xf:decimal    Produces a decimal value by parsing and interpreting a string.
xf:integer    Produces an integer value by parsing and interpreting a string.
xf:long       Produces a long value by parsing and interpreting a string.
xf:int        Produces an int value by parsing and interpreting a string.
xf:short      Produces a short value by parsing and interpreting a string.
xf:byte       Produces a byte value by parsing and interpreting a string.
xf:float      Produces a float value by parsing and interpreting a string.
xf:double     Produces a double value by parsing and interpreting a string.

For float and double, the string argument can indicate the special values: NaN, INF, -INF, +0, and -0.

If the argument string passed to a constructor results in an error (for example, if it contains a letter other than "E" or "e" or a string other than the special values named above), the constructor returns the error value as defined in [XQuery 1.0 and XPath 2.0 Data Model].
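
The general behaviour described in this section (whitespace stripping, the special values, and the error value for anything else) might be sketched for the double type as follows. This is a Python illustration under stated assumptions: make_double and the ERROR sentinel are hypothetical names, a Python float stands in for a double value, and the sketch is not a complete validator of the lexical space.

```python
import math

# Hypothetical stand-in for the error value of the data model.
ERROR = object()

# Special values named in the text, mapped to Python floats.
# "+0" and "-0" need no entry because float() already parses them.
_SPECIAL = {"NaN": math.nan, "INF": math.inf, "-INF": -math.inf}


def make_double(literal: str):
    """Sketch of a double constructor applied to a string literal."""
    # Leading and trailing whitespace is stripped before parsing.
    text = literal.strip(" \t\r\n")
    if text in _SPECIAL:
        return _SPECIAL[text]
    # Python's float() accepts spellings such as "inf", "nan" and "1_0"
    # that are not wanted here, so reject underscores and any letter
    # other than "E" or "e" before delegating to it.
    if "_" in text or any(ch.isalpha() and ch not in "Ee" for ch in text):
        return ERROR
    try:
        return float(text)
    except ValueError:
        return ERROR
```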

3.2.1 xf:decimal

xf:decimal(string $srcval) => decimal

Returns the decimal value that is represented by the characters contained in the value of $srcval. For this constructor, $srcval must be a string literal.

If the value of $srcval is not a valid lexical representation for the decimal type as specified in [XML Schema Part 2: Datatypes], then the error value is returned as specified in [XQuery 1.0 and XPath 2.0 Data Model].

If the number of characters contained in the value of $srcval that are digits is greater than the maximum number of decimal digits supported by the implementation, then the error value is returned as specified in [XQuery 1.0 and XPath 2.0 Data Model].

3.2.1.1 Examples
  • xf:decimal('123.5') returns the decimal value corresponding to one hundred twenty three and one-half.

  • xf:decimal('12.5E2') returns the error value, since the use of the letter "E" is prohibited in the constructor for the decimal type.

  • xf:decimal(' 12.5 ') returns the decimal value corresponding to twelve and one-half.
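
The three examples above can be reproduced with a short Python sketch. It is an illustration only: make_decimal and the ERROR sentinel are hypothetical names, Python's Decimal stands in for a decimal value, the regular expression reflects this sketch's reading of the decimal lexical form, and no implementation limit on the number of digits is modelled.

```python
import re
from decimal import Decimal

# Hypothetical stand-in for the error value of the data model.
ERROR = object()

# Optional sign, then digits with an optional fractional part.
# No exponent is permitted, which is why '12.5E2' is rejected.
_DECIMAL_RE = re.compile(r"[+-]?([0-9]+(\.[0-9]*)?|\.[0-9]+)")


def make_decimal(literal: str):
    """Sketch of xf:decimal applied to a string literal."""
    text = literal.strip(" \t\r\n")  # whitespace is stripped first
    if not _DECIMAL_RE.fullmatch(text):
        return ERROR
    return Decimal(text)
```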

3.2.2 xf:integer

xf:integer(string $srcval) => integer

Returns the integer value that is represented by the characters contained in the value of $srcval. For this constructor, $srcval must be a string literal.

If the value of $srcval is not a valid lexical representation for the integer type as specified in [XML Schema Part 2: Datatypes], then the error value is returned as specified in [XQuery 1.0 and XPath 2.0 Data Model].

If the number of characters contained in the value of $srcval that are digits is greater than the maximum number of digits supported by the implementation, then the error value is returned as specified in [XQuery 1.0 and XPath 2.0 Data Model].

3.2.2.1 Examples
  • xf:integer('-123') returns the integer value corresponding to negative one hundred twenty three.

  • xf:integer('123.5') returns the error value, since the use of a decimal point is prohibited in the constructor for the integer type.

3.2.3 xf:long

xf:long(string $srcval) => integer

Returns the long value that is represented by the characters contained in the value of $srcval. For this constructor, $srcval must be a string literal.

If the value of $srcval is not a valid lexical representation for the long type as specified in [XML Schema Part 2: Datatypes], then the error value is returned as specified in [XQuery 1.0 and XPath 2.0 Data Model].

If the value of the number corresponding to the characters contained in the value of $srcval is greater than 9,223,372,036,854,775,807 or less than -9,223,372,036,854,775,808, then the error value is returned as specified in [XQuery 1.0 and XPath 2.0 Data Model].

3.2.3.1 Examples
  • xf:long('-1235') returns the long value corresponding to negative one thousand two hundred thirty five.

  • xf:long('10000000000000000000') returns the error value, since ten quintillion is not a valid value for the long type.

3.2.4 xf:int

xf:int(string $srcval) => integer

Returns the int value that is represented by the characters contained in the value of $srcval. For this constructor, $srcval must be a string literal.

If the value of $srcval is not a valid lexical representation for the int type as specified in [XML Schema Part 2: Datatypes], then the error value is returned as specified in [XQuery 1.0 and XPath 2.0 Data Model].

If the value of the number corresponding to the characters contained in the value of $srcval is greater than 2,147,483,647 or less than -2,147,483,648, then the error value is returned as specified in [XQuery 1.0 and XPath 2.0 Data Model].

3.2.4.1 Examples
  • xf:int('1235') returns the int value corresponding to one thousand two hundred thirty five.

  • xf:int('2147483648') returns the error value, since the value two billion, one hundred forty seven million, four hundred eighty three thousand, six hundred forty eight is not a valid value for the int type.

3.2.5 xf:short

xf:short(string $srcval) => integer

Returns the short value that is represented by the characters contained in the value of $srcval. For this constructor, $srcval must be a string literal.

If the value of $srcval is not a valid lexical representation for the short type as specified in [XML Schema Part 2: Datatypes], then the error value is returned as specified in [XQuery 1.0 and XPath 2.0 Data Model].

If the value of the number corresponding to the characters contained in the value of $srcval is greater than 32,767 or less than -32,768, then the error value is returned as specified in [XQuery 1.0 and XPath 2.0 Data Model].

3.2.5.1 Examples
  • xf:short('1235') returns the short value corresponding to one thousand two hundred thirty five.

  • xf:short('32768') returns the error value, since the value thirty two thousand seven hundred sixty eight is not a valid value for the short type.

3.2.6 xf:byte

xf:byte(string $srcval) => integer

Returns the byte value that is represented by the characters contained in the value of $srcval. For this constructor, $srcval must be a string literal.

If the value of $srcval is not a valid lexical representation for the byte type as specified in [XML Schema Part 2: Datatypes], then the error value is returned as specified in [XQuery 1.0 and XPath 2.0 Data Model].

If the value of the number corresponding to the characters contained in the value of $srcval is greater than 127 or less than -128, then the error value is returned as specified in [XQuery 1.0 and XPath 2.0 Data Model].

3.2.6.1 Examples
  • xf:byte('127') returns the byte value corresponding to one hundred twenty seven.

  • xf:byte('128') returns the error value, since the value one hundred twenty eight is not a valid value for the byte type.
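
The range rules of 3.2.3 through 3.2.6 can be illustrated with a short Python sketch. It is not part of this specification: the function name construct, the RANGES table, and the use of ValueError to stand in for the error value are assumptions made for illustration, and the lexical check is simplified to an optional sign followed by decimal digits.

```python
import re

# Inclusive value ranges quoted in sections 3.2.3 to 3.2.6.
RANGES = {
    "long": (-9223372036854775808, 9223372036854775807),
    "int": (-2147483648, 2147483647),
    "short": (-32768, 32767),
    "byte": (-128, 127),
}

# Assumed lexical form: optional sign followed by ASCII decimal digits only.
_LEXICAL = re.compile(r"[+-]?[0-9]+")


def construct(type_name: str, srcval: str) -> int:
    """Parse srcval as type_name; ValueError models the error value."""
    if not _LEXICAL.fullmatch(srcval):
        raise ValueError(f"{srcval!r} is not a valid lexical form for {type_name}")
    value = int(srcval)
    low, high = RANGES[type_name]
    if not low <= value <= high:
        raise ValueError(f"{value} is out of range for {type_name}")
    return value
```

Under these assumptions, construct("byte", "127") yields 127, while construct("byte", "128") and construct("long", "10000000000000000000") both raise ValueError.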

3.2.7 xf:float

xf:float(string $srcval) => float

Returns the float value that is represented by the characters contained in the value of $srcval. For this constructor, $srcval must be a string literal.

If the value of $srcval is "NaN", then the constructor returns the "not-a-number" value.

If the value of $srcval is "INF" or "+INF", then the constructor returns the "positive infinity" value. If the value of $srcval is "-INF", then the constructor returns the "negative infinity" value.

If the value of $srcval is "0" or "+0", then the constructor returns the value positive zero. If the value of $srcval is "-0", then the constructor returns the value negative zero.

If the value of $srcval is not a valid lexical representation for the float type as specified in [XML Schema Part 2: Datatypes], then the error value is returned as specified in [XQuery 1.0 and XPath 2.0 Data Model].

If the value of the number corresponding to the characters contained in the value of $srcval is not a valid value for the float type, then the error value is returned as specified in [XQuery 1.0 and XPath 2.0 Data Model].

3.2.7.1 Examples
  • xf:float('510E2') returns the float value corresponding to fifty one thousand.

  • xf:float('15.25') returns the float value corresponding to fifteen and a quarter.

  • xf:float('51D1') returns the error value, since the use of the letter "D" is prohibited in the constructor for the float type.

3.2.8 xf:double

xf:double(string $srcval) => double

Returns the double value that is represented by the characters contained in the value of $srcval. For this constructor, $srcval must be a string literal.

If the value of $srcval is "NaN", then the constructor returns the "not-a-number" value.

If the value of $srcval is "INF" or "+INF", then the constructor returns the "positive infinity" value. If the value of $srcval is "-INF", then the constructor returns the "negative infinity" value.

If the value of $srcval is "0" or "+0", then the constructor returns the value positive zero. If the value of $srcval is "-0", then the constructor returns the value negative zero.

If the value of $srcval is not a valid lexical representation for the double type as specified in [XML Schema Part 2: Datatypes], then the error value is returned as specified in [XQuery 1.0 and XPath 2.0 Data Model].

If the value of the number corresponding to the characters contained in the value of $srcval is not a valid value for the double type, then the error value is returned as specified in [XQuery 1.0 and XPath 2.0 Data Model].

3.2.8.1 Examples
  • xf:double('510E2') returns the double value corresponding to fifty one thousand.

  • xf:double('15.25') returns the double value corresponding to fifteen and a quarter.

  • xf:double('51D1') returns the error value, since the use of the letter "D" is prohibited in the constructor for the double type.
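
As an informal illustration of the special-value rules in 3.2.7 and 3.2.8, the following Python sketch maps the lexical forms described above to floating-point values. The name construct_double, the simplified regular expression, and the use of ValueError for the error value are assumptions. Python's float is a double, so the narrower range of the float type is not modeled.

```python
import math
import re

# Assumed lexical form: decimal mantissa with an optional "E" or "e" exponent.
_NUMBER = re.compile(r"[+-]?([0-9]+(\.[0-9]*)?|\.[0-9]+)([Ee][+-]?[0-9]+)?")

# Special lexical forms named in the text above.
_SPECIAL = {"NaN": math.nan, "INF": math.inf, "+INF": math.inf, "-INF": -math.inf}


def construct_double(srcval: str) -> float:
    """Parse srcval as a double; ValueError models the error value."""
    if srcval in _SPECIAL:
        return _SPECIAL[srcval]
    if not _NUMBER.fullmatch(srcval):
        raise ValueError(f"{srcval!r} is not a valid lexical form")
    value = float(srcval)  # "-0" parses to negative zero
    if math.isinf(value):
        # A finite literal that overflows is not a valid value for the type.
        raise ValueError(f"{srcval!r} is out of range")
    return value
```

With this sketch, construct_double('510E2') yields 51000.0, and construct_double('51D1') raises ValueError because "D" is not accepted as an exponent marker.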

3.3 Operators on Numeric Values

The arguments and return types for the arithmetic operators are the basic numeric types: decimal, float, and double and types derived from them. For simplicity, each operator is defined to operate on operands of the same datatype and to return the same datatype. If the two operands are not of the same datatype, one operand is promoted to be the type of the other operand.

The type promotion scheme includes only two rules:

  1. A derived type may be promoted to its base type.

  2. decimal may be promoted to float, and float may be promoted to double.

The result type of operations depends on their argument datatypes and is defined in the following table:

Operator                         Returns
op:operation(decimal, decimal)   decimal
op:operation(float, float)       float
op:operation(double, double)     double
op:operation(decimal)            decimal
op:operation(float)              float
op:operation(double)             double

These rules define any operation on any pair of arithmetic types. Consider the following example:

op:operation(int, double) => op:operation(double, double)

For this operation, int must be converted to double. This can be done, since by the rules above: int can be promoted to integer, integer can be promoted to decimal, decimal can be promoted to float, and float can be promoted to double. As far as possible, the promotions should be done in a single step. Specifically, when a decimal is promoted to a double, it must not be converted to a float and then to double as this will lose precision.

As another example, a user may define height as a derived type of integer with a minimum value of 20 and a maximum value of 100. The user may then derive oddHeight using a pattern to restrict the value to odd integers.

op:operation(oddHeight, 2) => op:operation(decimal, decimal)

oddHeight is first promoted to its base type height. height is promoted to its base type integer and integer to its base type decimal.
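
The two promotion rules can be sketched in Python as follows. This is an illustration only: the names BASE_TYPE, NUMERIC_ORDER, primitive_of and operation_type are hypothetical, and the base-type links for the built-in integer types are taken from [XML Schema Part 2: Datatypes]. The sketch determines only the common operand type; as noted above, the value itself should be converted in a single step where possible.

```python
# Rule 1: base-type links. "height" and "oddHeight" are the user-defined
# types from the example in the text.
BASE_TYPE = {
    "oddHeight": "height",
    "height": "integer",
    "byte": "short",
    "short": "int",
    "int": "long",
    "long": "integer",
    "integer": "decimal",
}

# Rule 2: decimal may be promoted to float, and float to double.
NUMERIC_ORDER = ["decimal", "float", "double"]


def primitive_of(type_name: str) -> str:
    """Follow base-type links until decimal, float, or double is reached."""
    while type_name not in NUMERIC_ORDER:
        type_name = BASE_TYPE[type_name]
    return type_name


def operation_type(left: str, right: str) -> str:
    """Return the common operand type, which is also the result type."""
    return max(primitive_of(left), primitive_of(right), key=NUMERIC_ORDER.index)
```

For example, operation_type("int", "double") gives "double", and operation_type("oddHeight", "integer") gives "decimal", matching the two examples above.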

[Issue 72: Effects of overflow and underflow unspecified]

Finally, consider some examples involving special IEEE 754 numerics.

  1. If either argument is "NaN", the result is "NaN".

  2. If neither argument is "NaN", but either argument is "INF", the result is "INF".

  3. If neither argument is "NaN" or "INF", but either argument is "-INF", the result is "-INF".

Note: In the case of multiplication and division, "INF" may become "-INF", and vice versa, as appropriate.

The functions op:numeric-add, op:numeric-subtract, op:numeric-multiply, op:numeric-divide and op:numeric-mod are each defined for three pairs of numeric operands of the same type: decimal, float and double. The functions op:numeric-unary-plus and op:numeric-unary-minus are defined for a single operand of the same three numeric types.

3.3.1 op:numeric-add

op:numeric-add(numeric $operand1, numeric $operand2) => numeric

Backs up the "+" operator and returns the arithmetic sum of its operands: ($operand1 + $operand2).

3.3.2 op:numeric-subtract

op:numeric-subtract(numeric $operand1, numeric $operand2) => numeric

Backs up the "-" operator and returns the arithmetic difference of its operands: ($operand1 - $operand2).

3.3.3 op:numeric-multiply

op:numeric-multiply(numeric $operand1, numeric $operand2) => numeric

Backs up the "*" operator and returns the arithmetic product of its operands: ($operand1 * $operand2).

3.3.4 op:numeric-divide

op:numeric-divide(numeric $operand1, numeric $operand2) => numeric

Backs up the "div" operator and returns the arithmetic quotient of its operands: ($operand1 div $operand2).

3.3.5 op:numeric-mod

op:numeric-mod(numeric $operand1, numeric $operand2) => numeric

Backs up the "mod" operator and returns the remainder after dividing the first operand by the second operand: ($operand1 mod $operand2). The result is of the same type as the operands after type promotion. The following rules apply:

  • If either operand is NaN, the result is NaN.

  • If the dividend is an infinity, or the divisor is a zero, or both, the result is NaN.

  • If not NaN, the sign of the result equals the sign of the dividend.

  • If the dividend is finite and the divisor is an infinity, the result equals the dividend.

  • If the dividend is a zero and the divisor is finite, the result is the same as the dividend.

  • In the remaining cases, where neither an infinity, nor a zero, nor NaN is involved, the float or double remainder r from a dividend n and a divisor d is defined by the mathematical relation r = n - (d * q) where q is an integer that is negative only if n/d is negative and positive only if n/d is positive, and whose magnitude is as large as possible without exceeding the magnitude of the true mathematical quotient of n and d. This is truncating division, analogous to integer division, not [IEEE 754-1985] rounding division.

Evaluation of a floating-point mod never throws a run-time exception, even if the right-hand operand is zero. Overflow, underflow, or loss of precision cannot occur.

3.3.5.1 Examples
  • op:numeric-mod(10,3) returns 1.

  • op:numeric-mod(6,2) returns 0.

  • op:numeric-mod(4.5,1.2) returns 0.9.

  • op:numeric-mod(1.23E2, 0.6E1) returns 3.0E0.
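
The rules for floating-point operands can be sketched in Python. The function name numeric_mod is an assumption, and the sketch covers float and double operands only. Python's math.fmod computes the truncating remainder described above but raises an exception for a zero divisor or an infinite dividend, so those cases are handled explicitly.

```python
import math


def numeric_mod(n: float, d: float) -> float:
    """Truncating remainder of n divided by d, following the rules of 3.3.5."""
    n, d = float(n), float(d)
    if math.isnan(n) or math.isnan(d):
        return math.nan
    if math.isinf(n) or d == 0:
        return math.nan
    if math.isinf(d) or n == 0:
        # Finite dividend with infinite divisor, or zero dividend: result is the dividend.
        return n
    # r = n - (d * q) with q truncated toward zero; the sign follows the dividend.
    return math.fmod(n, d)
```

Under this sketch, numeric_mod(10, 3) is 1.0 and numeric_mod(-10, 3) is -1.0, while numeric_mod(1.0, 0.0) is NaN rather than an exception.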

3.3.6 op:numeric-unary-plus

op:numeric-unary-plus(numeric $operand) => numeric

Backs up the unary "+" operator and returns its operand with the sign unchanged: (+ $operand). Semantically, this operation performs no operation.

3.3.7 op:numeric-unary-minus

op:numeric-unary-minus(numeric $operand) => numeric

Backs up the unary "-" operator and returns its operand with the sign reversed: (- $operand). If $operand is positive, its negative is returned; if it is negative, its positive is returned.

3.4 Comparisons of Numeric Values

We define the following comparison operators on numeric values. Comparisons take two arguments of the same type. If the arguments are of different types, one argument is promoted to the type of the other. Each comparison operator returns a boolean value. If either operand, or both, is "NaN", false is returned.

[Issue 113: Need more complete numeric comparison semantics]

3.4.1 op:numeric-equal

op:numeric-equal(numeric $operand1, numeric $operand2) => boolean

Returns true if and only if $operand1 is exactly equal to $operand2. This function backs up the "eq" and "ne" operators on numeric values.

3.4.2 op:numeric-less-than

op:numeric-less-than(numeric $operand1, numeric $operand2) => boolean

Returns true if and only if $operand1 is less than $operand2. This function backs up the "lt" and "ge" operators on numeric values.

3.4.3 op:numeric-greater-than

op:numeric-greater-than(numeric $operand1, numeric $operand2) => boolean

Returns true if and only if $operand1 is greater than $operand2. This function backs up the "gt" and "le" operators on numeric values.

3.5 Functions on Numeric Values

The following functions are defined on these numeric types. Each function returns an integer except:

  • If the argument is the empty sequence, the empty sequence is returned.

  • If the argument is "NaN", "NaN" is returned.

[Issue 79: How many digits of precision (etc.) are returned from certain functions?]

[Issue 142: Should floor ceiling and round return the same type as their argument? ]

3.5.1 xf:floor

xf:floor(double? $srcval) => integer?

Returns the largest (closest to positive infinity) integer that is not greater than the value of $srcval. If the argument is the empty sequence, returns the empty sequence.

3.5.1.1 Examples
  • xf:floor(10.5) returns 10.

  • xf:floor(-10.5) returns -11.

3.5.2 xf:ceiling

xf:ceiling(double? $srcval) => integer?

Returns the smallest (closest to negative infinity) number that is not smaller than the value of $srcval and that is an integer. If the argument is the empty sequence, returns the empty sequence.

3.5.2.1 Examples
  • xf:ceiling(10.5) returns 11.

  • xf:ceiling(-10.5) returns -10.

3.5.3 xf:round

xf:round(double? $srcval) => integer?

Returns the number that is closest to the argument and that is an integer. If there are two such numbers, then the one that is closest to positive infinity is returned. More formally, round(x) produces the same result as floor(x+0.5). These semantics are consistent with Java's semantics. If the argument is NaN, then NaN is returned. If the argument is positive infinity, then positive infinity is returned. If the argument is negative infinity, then negative infinity is returned. If the argument is positive zero, then positive zero is returned. If the argument is negative zero, then negative zero is returned. If the argument is less than zero, but greater than or equal to -0.5, then negative zero is returned. If the argument is the empty sequence, returns the empty sequence.

3.5.3.1 Examples
  • xf:round(2.5) returns 3.

  • xf:round(2.4999) returns 2.

  • xf:round(-2.5) returns -2 (not the possible alternative, -3).
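
The rounding rule floor(x+0.5) and its special cases can be sketched in Python. The name xf_round is hypothetical, None stands in for the empty sequence, and the sketch returns a floating-point value so that NaN, the infinities and negative zero can be represented. That return type is an assumption, since the return type is the subject of Issue 142.

```python
import math
from typing import Optional


def xf_round(srcval: Optional[float]) -> Optional[float]:
    """Round to the nearest integer; ties go toward positive infinity."""
    if srcval is None:
        return None  # the empty sequence maps to the empty sequence
    if math.isnan(srcval) or math.isinf(srcval) or srcval == 0:
        return srcval  # NaN, infinities and signed zeros are returned unchanged
    if -0.5 <= srcval < 0:
        return -0.0
    return float(math.floor(srcval + 0.5))
```

Under these assumptions, xf_round(2.5) is 3.0 and xf_round(-2.5) is -2.0, in agreement with the examples above.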

4 Constructors and Functions on Strings

This section discusses functions and operators on the [XML Schema Part 2: Datatypes] string datatype and the datatypes derived from string.

4.1 String Types

The operators described in this section are defined on the following string types.

string
normalizedString
token
language
NMTOKEN
Name
NCName
ID
IDREF
ENTITY

They also apply to user-defined types derived by restriction from these types.

4.2 String Constructors

The following constructors are defined on string types. Each constructor takes a single string literal as argument.

Constructor            Meaning
xf:string              Produces a string value by parsing and interpreting a supplied string.
xf:normalizedString    Produces a normalizedString (the XML Schema datatype) value by parsing and interpreting a string.
xf:token               Produces a token value by parsing and interpreting a string.
xf:language            Produces a language value by parsing and interpreting a string.
xf:Name                Produces a Name value by parsing and interpreting a string.
xf:NMTOKEN             Produces an NMTOKEN value by parsing and interpreting a string.
xf:NCName              Produces an NCName value by parsing and interpreting a string.
xf:ID                  Produces an ID value by parsing and interpreting a string.
xf:IDREF               Produces an IDREF value by parsing and interpreting a string.
xf:ENTITY              Produces an ENTITY value by parsing and interpreting a string.

[Issue 14: Some function signatures are unclear about argument types]

4.2.1 xf:string

xf:string(string $srcval) => string

Returns a string value that is the value of $srcval. The more general function 2.3 xf:string returns the string value for several kinds of input arguments. If the input argument is a string it just returns the argument string. Thus, this constructor can be correctly perceived as a "no-op", but is included for the sake of orthogonality.

4.2.1.1 Examples
  • xf:string('abc') returns "abc".

  • In the context of an XML document, xf:string('Jéro&#x0302;me') returns "Jérôme". (The "&#x0302;" is the numeric character reference for the Unicode character U+0302, called "Combining Circumflex Accent". It must be expanded by the XML parser; the constructors do not search for or replace numeric character references.)

    Note:

    The preceding semantic is correct if and only if this document requires the use of Unicode Normalization Form C (NFC) semantics for this constructor. [Character Model for the World Wide Web 1.0] requires normalization following certain operations, so it may be appropriate to mandate it here, too.

4.2.2 xf:normalizedString

xf:normalizedString(string $srcval) => normalizedString

Returns a normalizedString — the XML Schema datatype — value that is the value of $srcval. Every character contained in $srcval that is a line feed (#xA) is removed from the returned value.

If the argument string passed to a constructor is not a valid value in the lexical space of normalizedString as specified in [XML Schema Part 2: Datatypes], then the constructor returns the error value as defined in [XQuery 1.0 and XPath 2.0 Data Model]. Note that the argument to construct a normalizedString cannot contain the carriage return (#xD) or the tab (#x9) character.

4.2.2.1 Examples
  • xf:normalizedString('abc') returns "abc".

  • In the context of an XML document, xf:normalizedString('ab&#xA;cd') returns "abcd". (The "&#xA;" is a numeric character reference that must be expanded by the XML parser.)

  • In the context of an XML document, xf:normalizedString('ab&#xD;cd') returns the error value. (The "&#xD;" is a numeric character reference that must be expanded by the XML parser.)
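
The behavior described in 4.2.2 can be sketched in Python as follows. The function name normalized_string and the use of ValueError for the error value are assumptions, and the input is taken to be the string after the XML parser has expanded any character references.

```python
def normalized_string(srcval: str) -> str:
    """Model of xf:normalizedString as described in 4.2.2."""
    # Carriage return (#xD) and tab (#x9) are not permitted in the argument.
    if "\r" in srcval or "\t" in srcval:
        raise ValueError("argument contains a carriage return or tab")
    # Every line feed (#xA) is removed from the returned value.
    return srcval.replace("\n", "")
```

With this sketch, normalized_string("ab\ncd") gives "abcd" and normalized_string("ab\rcd") raises ValueError.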

4.2.3 xf:token

xf:token(string $srcval) => token

Returns a token value that is the value of $srcval. Note that the argument to construct a token must not contain line feed (#xA) or tab (#x9) characters, must have no leading or trailing spaces (#x20), and must have no internal sequences of two or more spaces. If the argument string passed to a constructor results in an error, the constructor returns the error value as defined in [XQuery 1.0 and XPath 2.0 Data Model].
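
The constraints listed for xf:token can be checked as in the following Python sketch. The function name token and the use of ValueError for the error value are assumptions; only the conditions stated in this section are tested.

```python
def token(srcval: str) -> str:
    """Return srcval if it satisfies the constraints stated in 4.2.3."""
    if "\n" in srcval or "\t" in srcval:
        raise ValueError("argument contains a line feed or tab")
    if srcval.startswith(" ") or srcval.endswith(" "):
        raise ValueError("argument has a leading or trailing space")
    if "  " in srcval:
        raise ValueError("argument has two or more consecutive spaces")
    return srcval
```

For instance, token("a b c") returns "a b c", while token(" a"), token("a  b") and token("a\tb") each raise ValueError.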

4.2.4 xf:language

xf:language(string $srcval) => language

Returns a language value that is the value of $srcval. Note that the value of $srcval to construct a value of type language must be a valid language identifier as defined in the language identification section of [XML 1.0 Recommendation (Second Edition)]. If the argument string passed to a constructor results in an error (for example, xyx is not a valid language identifier), the constructor returns the error value as defined in [XQuery 1.0 and XPath 2.0 Data Model].

4.2.5 xf:Name

xf:Name(string $srcval) => Name

Returns a Name value that is the value of $srcval. Note that the value of $srcval to construct a value of type Name must match the Name production of [XML 1.0 Recommendation (Second Edition)]. If the argument string passed to a constructor results in an error, the constructor returns the error value as defined in [XQuery 1.0 and XPath 2.0 Data Model].

4.2.6 xf:NMTOKEN

xf:NMTOKEN(string $srcval) => NMTOKEN

Returns an NMTOKEN value that is the value of $srcval. Note that the value of $srcval to construct a value of type NMTOKEN must match the Nmtoken production of [XML 1.0 Recommendation (Second Edition)]. If the argument string passed to a constructor results in an error, the constructor returns the error value as defined in [XQuery 1.0 and XPath 2.0 Data Model].

4.2.7 xf:NCName

xf:NCName(string $srcval) => NCName

Returns an NCName value that is the value of $srcval. Note that the value of $srcval to construct a value of type NCName must match the NCName production of [XML 1.0 Recommendation (Second Edition)]. If the argument string passed to a constructor results in an error, the constructor returns the error value as defined in [XQuery 1.0 and XPath 2.0 Data Model].

4.2.8 xf:ID

xf:ID(string $srcval) => ID

Returns an ID value that is the value of $srcval. Note that the value of $srcval to construct a value of type ID must match the NCName production of [XML 1.0 Recommendation (Second Edition)]. If the argument string passed to a constructor results in an error, the constructor returns the error value as defined in [XQuery 1.0 and XPath 2.0 Data Model].

The semantic correctness of ID values (that they must be unique within a given document) is not enforced by the xf:ID function.

4.2.9 xf:IDREF

xf:IDREF(string $srcval) => IDREF

Returns an IDREF value that is the value of $srcval. Note that the value of $srcval to construct a value of type IDREF must match the NCName production of [XML 1.0 Recommendation (Second Edition)]. If the argument string passed to a constructor results in an error, the constructor returns the error value as defined in [XQuery 1.0 and XPath 2.0 Data Model].

The semantic correctness of IDREF values (that there must be a corresponding ID value in the same document) is not enforced by the xf:IDREF function.

4.2.10 xf:ENTITY

xf:ENTITY(string $srcval) => ENTITY

Returns an ENTITY value that is the value of $srcval. Note that the value of $srcval to construct a value of type ENTITY must match the NCName production of [XML 1.0 Recommendation (Second Edition)]. If the argument string passed to a constructor results in an error, the constructor returns the error value as defined in [XQuery 1.0 and XPath 2.0 Data Model].

4.3 Equality and Comparison of Strings

The [Character Model for the World Wide Web 1.0] discusses the fact that strings from a particular character set may need to be collated (sorted) differently for different applications. Thus, the collation needs to be taken into account when comparing strings in any context. Several functions in this and the following section make use of a collation.

In this document, we assume that collations are named and the collation name, specified as a literal, is used as an argument to the comparison function. This document will also define the manner in which a default collation is determined, allowing the collation argument to be optional in the functions that allow it.

The value of the xml:lang attribute cannot be used to identify the default collation because such information is localized and may not be the same for both operands.

[Issue 44: Collations: URIs and URI references or short names?]

[Issue 70: How are "default" collations determined?]

Some collations can be "tailored" for various purposes. See [Unicode Collation Algorithm]. This document does not discuss tailoring. Instead, we assume that the collation argument to the various functions below is a tailored and named collation.

A specially named collation provides the ability to compare strings based on codepoint values.

A user who wishes to preserve the XPath 1.0 semantics of "<" and ">" can define a collation that converts each string to a number and then compares them numerically.

Collations can also indicate that some characters that are rendered differently are, in fact, equal for collation purposes (e.g., "uve" and "uwe" are considered equivalent in some European languages). Thus, strings can be compared character-by-character or in a logical manner based on the collation.

The [Character Model for the World Wide Web 1.0] recommends that all strings be subjected to Unicode normalization early and, thus, string comparisons need only be defined on normalized strings. If this is not the case, then we may also want to compare unnormalized strings based on their normalized representations.

Function      Meaning                                                                     Source
xf:compare    Compares two character strings; a collation may optionally be specified.    XSLT 2.0, Req. 2.13 (Could)

[Issue 73: Is a "between" function needed?]

4.3.1 xf:compare

xf:compare(string? $comparand1, string? $comparand2) => integer?
xf:compare(string? $comparand1, string? $comparand2, anyURI $collationLiteral) => integer?

Returns -1, 0, or 1, depending on whether the value of $comparand1 is respectively less than, equal to, or greater than the value of $comparand2, according to the rules of the collation that is used.

If $collationLiteral is specified, then the value of $collationLiteral must identify a collation that is supported in the environment in which the function is invoked. Either an absolute URI or a relative URI may be specified. The collation identified by $collationLiteral, while effectively based on the Unicode Collation Algorithm, has implementation-defined semantics.

If no collation is specified, then the default collation is used. The default collation is determined according to the rules in TO BE DETERMINED.

If the value of $comparand2 begins with a string that is equal to the value of $comparand1 (according to the collation that is used) and has additional characters following that beginning string, then the result is -1. If the value of $comparand1 begins with a string that is equal to the value of $comparand2 (according to the collation that is used) and has additional characters following that beginning string, then the result is 1.

If either argument is the empty sequence, the result is the empty sequence.

This function backs up the "eq", "ne", "gt", "lt", "le" and "ge" operators on string values.

[Issue 141: Does string equality use the codepoint collation or the default collation?]

4.3.1.1 Examples
  • xf:compare('abc', 'abc') returns 0.

  • xf:compare('Strasse', 'Straße') returns 0 if and only if the default collation includes provisions that equate "ss" and the (German) character "ß" ("sharp-s"). (Otherwise, the returned value depends on the semantics of the default collation.)

  • xf:compare('Strasse', 'Straße', anyURI('deutsch')) returns 0 if and only if the collation identified by the relative URI constructed from the string value "deutsch" includes provisions that equate "ss" and the (German) character "ß" ("sharp-s"). (Otherwise, the returned value depends on the semantics of that collation.)

  • xf:compare('Strassen', 'Straße') returns 1 if and only if the default collation includes provisions that equate "ss" and the (German) character "ß" ("sharp-s"). (Since the value of $comparand1 has an additional character, an "n", following the string that is equal to "Straße", it is greater than the value of $comparand2.)
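
The role of the collation in xf:compare can be sketched in Python by modeling a collation as a function that maps each string to a sort key. This is a simplification and not the Unicode Collation Algorithm. The names compare, codepoint_key and german_key are hypothetical, german_key stands in for a collation that equates "ss" and "ß", and None stands in for the empty sequence.

```python
from typing import Callable, Optional


def codepoint_key(s: str) -> str:
    """Collation key for comparison by codepoint values."""
    return s


def german_key(s: str) -> str:
    """Hypothetical collation key that equates the sharp-s with "ss"."""
    return s.replace("ß", "ss")


def compare(
    comparand1: Optional[str],
    comparand2: Optional[str],
    collation_key: Callable[[str], str] = codepoint_key,
) -> Optional[int]:
    """Return -1, 0, or 1, or None if either argument is the empty sequence."""
    if comparand1 is None or comparand2 is None:
        return None
    key1, key2 = collation_key(comparand1), collation_key(comparand2)
    return (key1 > key2) - (key1 < key2)
```

With german_key, compare('Strasse', 'Straße', german_key) is 0 and compare('Strassen', 'Straße', german_key) is 1, reflecting the prefix rule above. With the codepoint key, 'Strasse' compares less than 'Straße'.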

4.4 Functions on String Values

The following functions are defined on these string types. Several of these functions use a collation. See 4.3 Equality and Comparison of Strings for a discussion of collations.

Function                Meaning    Source
xf:concat               Concatenates two or more character strings.    XPath 1.0
xf:starts-with          Indicates whether the value of one string begins with the characters of the value of another string.    XPath 1.0
xf:ends-with            Indicates whether the value of one string ends with the characters of the value of another string.    XPath 1.0
xf:contains             Indicates whether the value of one string contains the characters of the value of another string. A collation may optionally be specified.    XPath 1.0
xf:substring            Returns a string located at a specified place in the value of a string.    XPath 1.0
xf:string-length        Returns the length of the argument.    XPath 1.0
xf:substring-before     Returns the characters of one string that precede in that string the characters in the value of another string. A collation may optionally be specified.    XPath 1.0
xf:substring-after      Returns the characters of one string that follow in that string the characters in the value of another string. A collation may optionally be specified.    XPath 1.0
xf:normalize-space      Returns the whitespace-normalized value of the argument.    XPath 1.0
xf:normalize-unicode    Returns the normalized value of the first argument in the normalization form specified by the second argument.    XPath 2.0 Req 2.9 (Should)
xf:upper-case           Returns the upper-cased value of the argument.    XPath 2.0 Req 2.4.3 (Should)
xf:lower-case           Returns the lower-cased value of the argument.    XPath 2.0 Req 2.4.3 (Should)
xf:translate            Returns the first argument string with occurrences of characters in the second argument replaced by the character at the corresponding position in the third string.    XPath 1.0
xf:string-pad           Returns a string composed of as many copies of its first argument as specified in its second argument.    XPath 2.0 Req 2.4.2, 4.4 (Should)
xf:match                Returns a sequence of integers indicating the positions in the value of the first argument that are matched by the regular expression that is the value of the second argument.    XPath 2.0 Req 3. (Must)
xf:replace              Returns the first argument with every substring matched by the second argument replaced by the value of the third argument.    XPath 2.0 Req 2.4.1. (Should)

[Issue 23: "Returns a copy" is not appropriate wording]

[Issue 21: What is the precise type returned by each function?]

[Issue 37: Linguistic contains required?]

[Issue 94: Must allow searching for words near other words. ]

[Issue 143: Should we add a tokenize function to break a string into tokens?]

4.4.1 Usage Notes

Note that the resulting string after operations such as concatenation or substring must be normalized. See [Character Model for the World Wide Web 1.0].

[Issue 108: Should strings always be returned in Unicode normalized form?]

Note also that when the above operators and functions are applied to datatypes derived from string, they are guaranteed to return legal strings, but they may not return legal values for the particular subtype to which they were applied.

[Issue 20: Many uses of "character" should be "codepoint"]

4.4.2 xf:concat

xf:concat() => string
xf:concat(string? $op1) => string
xf:concat(string? $op1, string? $op2, ...) => string

Accepts zero or more strings as arguments. Returns the string that is the concatenation of the values of its arguments. The resulting string might not be normalized in any Unicode or W3C normalization. If called with no arguments, returns the zero-length string. If any of the arguments is the empty sequence it is treated as the zero-length string.

The concat() function is specified to allow an arbitrary number of string arguments that are concatenated together. This capability is retained for compatibility with [XPath 1.0] and is the only function specified in this document that has that characteristic.

[Issue 144: Should the concat function accept sequences as arguments? ]

4.4.2.1 Examples
  • xf:concat('abc', 'def') returns "abcdef".

  • xf:concat('abc') returns "abc".

  • xf:concat('abc', 'def', 'ghi', 'jkl', 'mno') returns "abcdefghijklmno".

  • xf:concat(()) returns "".
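
The treatment of the empty sequence in xf:concat can be sketched in Python, with None standing in for the empty sequence. The function name concat is an assumption made for illustration.

```python
from typing import Optional


def concat(*operands: Optional[str]) -> str:
    """Concatenate the arguments, treating the empty sequence as ""."""
    return "".join("" if op is None else op for op in operands)
```

Under this sketch, concat() and concat(None) both return the zero-length string, and concat('abc', None, 'def') returns "abcdef".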

4.4.3 xf:starts-with

xf:starts-with(string? $operand1, string? $operand2) => boolean?
xf:starts-with(string? $operand1, string? $operand2, anyURI $collationLiteral) => boolean?

Returns a boolean indicating whether or not the value of $operand1 starts with a string that is equal to the value of $operand2 according to the collation that is used.

If the value of $operand2 is the zero-length string, then the function returns true. If the value of $operand1 is the zero-length string and the value of $operand2 is not the zero-length string, then the function returns false.

If the value of $operand1 or $operand2 is the empty sequence, the empty sequence is returned.

If $collationLiteral is specified, then the value of $collationLiteral must identify a collation that is supported in the environment in which the function is invoked. Either an absolute URI or a relative URI may be specified. The collation identified by $collationLiteral, while effectively based on the Unicode Collation Algorithm, has implementation-defined semantics.

If no collation is specified, then the default collation is used. The default collation is determined according to the rules in TO BE DETERMINED.

4.4.3.1 Examples
  • xf:starts-with("goldenrod", "gold") returns true.

  • xf:starts-with("goldenrod", "") returns true.

  • xf:starts-with("goldenrod", "rod") returns false.

4.4.4 xf:ends-with

xf:ends-with(string? $operand1, string? $operand2) => boolean?
xf:ends-with(string? $operand1, string? $operand2, anyURI $collationLiteral) => boolean?

Returns a boolean indicating whether or not the value of $operand1 ends with a string that is equal to the value of $operand2 according to the collation that is used.

If the value of $operand2 is the zero-length string, then the function returns true. If the value of $operand1 is the zero-length string and the value of $operand2 is not the zero-length string, then the function returns false.

If the value of $operand1 or $operand2 is the empty sequence, the empty sequence is returned.

If $collationLiteral is specified, then the value of $collationLiteral must identify a collation that is supported in the environment in which the function is invoked. Either an absolute URI or a relative URI may be specified. The collation identified by $collationLiteral, while effectively based on the Unicode Collation Algorithm, has implementation-defined semantics.

If no collation is specified, then the default collation is used. The default collation is determined according to the rules in TO BE DETERMINED.

4.4.4.1 Examples
  • xf:ends-with("goldenrod", "rod") returns true.

  • xf:ends-with("", "rod") returns false.

4.4.5 xf:contains

xf:contains(string? $operand1, string? $operand2) => boolean?
xf:contains(string? $operand1, string? $operand2, anyURI $collationLiteral) => boolean?

Returns a boolean indicating whether or not the value of $operand1 contains (at the beginning, at the end, or anywhere within) a string equal to the value of $operand2 according to the collation that is used.

If the value of $operand2 is the zero-length string, then the function returns true. If the value of $operand1 is the zero-length string and the value of $operand2 is not the zero-length string, then the function returns false.

If the value of $operand1 or $operand2 is the empty sequence, the empty sequence is returned.

If $collationLiteral is specified, then the value of $collationLiteral must identify a collation that is supported in the environment in which the function is invoked. Either an absolute URI or a relative URI may be specified. The collation identified by $collationLiteral, while effectively based on the Unicode Collation Algorithm, has implementation-defined semantics.

If no collation is specified, then the default collation is used. The default collation is determined according to the rules in TO BE DETERMINED.
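
A minimal Python sketch of the containment test, that is, whether the second operand occurs anywhere within the first, may help make the rules concrete. It is offered under stated assumptions: None stands in for the empty sequence, the name contains is hypothetical, and matching is by code point rather than by a collation.

```python
from typing import Optional


def contains(operand1: Optional[str],
             operand2: Optional[str]) -> Optional[bool]:
    """Return whether operand2 occurs anywhere within operand1."""
    # Either operand being the empty sequence yields the empty sequence.
    if operand1 is None or operand2 is None:
        return None
    # The zero-length string is contained in every string, including
    # the zero-length string itself.
    return operand2 in operand1
```

With this sketch, contains("goldenrod", "den") and contains("goldenrod", "") are both true, contains("", "rod") is false, and contains(None, "rod") yields None.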

4.4.6 xf:substring

xf:substring(string? $sourceString, decimal? $startingLoc) => string?
xf:substring(string? $sourceString, decimal? $startingLoc, decimal? $length) => string?

Returns the portion of the value of $sourceString beginning at the position indicated by the value of $startingLoc and continuing for the number of characters indicated by the value of $length. More specifically, returns the characters in $sourceString whose position $p obeys:

xf:round($startingLoc) <= $p < xf:round($startingLoc + $length)

If $length is not specified, the substring identifies characters to the end of $sourceString.

If $length is greater than the number of characters in the value of $sourceString following $startingLoc, the substring identifies characters to the end of $sourceString.

The first character of a string is located at position 1 (not position 0).

If the value of $startingLoc is negative or greater than the length of $sourceString, an error value as defined in [XQuery 1.0 and XPath 2.0 Data Model] is returned.

If the value of any of the three parameters is the empty sequence, the empty sequence is returned.

4.4.6.1 Examples
  • xf:substring("motor car", 6) returns " car".

  • xf:substring("metadata", 4, 3) returns "ada".
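
The position rule can be traced with a short Python sketch. Several points are assumptions made for illustration: None stands in for the empty sequence, the error value is modelled as a ValueError, the names substring, xf_round and _OMITTED are hypothetical, and xf:round is assumed to round halves toward positive infinity (Python's built-in round does not, so a helper is used).

```python
import math

# Sentinel distinguishing "argument omitted" from "empty sequence" (None).
_OMITTED = object()


def xf_round(value):
    """Round to the nearest integer, halves toward positive infinity."""
    return math.floor(value + 0.5)


def substring(source, starting_loc, length=_OMITTED):
    """Select characters at positions p, counted from 1, such that
    xf_round(starting_loc) <= p < xf_round(starting_loc + length)."""
    # Any empty-sequence argument yields the empty sequence.
    if source is None or starting_loc is None or length is None:
        return None
    # Assumption: the error value is modelled by raising an exception.
    if starting_loc < 0 or starting_loc > len(source):
        raise ValueError("startingLoc is out of range")
    first = xf_round(starting_loc)
    if length is _OMITTED:
        end = len(source) + 1  # run to the end of the string
    else:
        end = xf_round(starting_loc + length)
    # Positions past the end of the string simply select nothing.
    return "".join(ch for p, ch in enumerate(source, start=1)
                   if first <= p < end)
```

For xf:substring("metadata", 4, 3) the bounds are 4 <= $p < 7, which selects positions 4, 5 and 6, giving "ada".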

4.4.7 xf:string-length

xf:string-length(string? $srcval) => integer?

Returns an integer equal to the length in characters of the value of $srcval. If the value of $srcval is the empty sequence, the empty sequence is returned.

4.4.7.1 Examples
  • xf:string-length("first we kill the lawyers") returns 25.
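
As a rough Python illustration, with None standing in for the empty sequence and the name string_length chosen for this sketch only, the function can be modelled as follows. Python's len counts Unicode code points, which is taken here as the meaning of "length in characters"; that reading is an assumption.

```python
from typing import Optional


def string_length(srcval: Optional[str]) -> Optional[int]:
    """Return the number of characters in srcval, or None if it is empty."""
    # The empty sequence yields the empty sequence.
    if srcval is None:
        return None
    # len() counts code points, so a character outside the Basic
    # Multilingual Plane counts once, not as two UTF-16 code units.
    return len(srcval)
```

Note that the zero-length string has length 0, which is distinct from the empty-sequence result.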

4.4.8 xf:substring-before

xf:substring-before(string? $operand1, string? $operand2) => string?
xf:substring-before(string? $operand1, string? $operand2, anyURI $collationLiteral) => string?

Returns the substring of the value of $operand1 that precedes in the value of $operand1 the fir