RIF Data Types and Built-Ins

W3C Editor's Draft 18 May 2008

This version:: http://www.w3.org/2005/rules/wg/draft/ED-rif-dtb-20080518/
Latest editor's draft:: http://www.w3.org/2005/rules/wg/draft/rif-dtb/
Previous version:: http://www.w3.org/2005/rules/wg/draft/ED-rif-dtb-20080222/ (color-coded diff)

Editors:: Axel Polleres, DERI; Harold Boley, National Research Council Canada; Michael Kifer, State University of New York at Stony Brook

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.

This document is being published as one of a set of 5 documents:

Please Comment By 2008-05-25

The Rule Interchange Format (RIF) Working Group seeks public feedback on these Working Drafts. Please send your comments to public-rif-comments@w3.org (public archive). If possible, please offer specific changes to the text that would address your concern. You may also wish to check the Wiki Version of this document for internal-review comments and changes being drafted which may address your concerns.

No Endorsement

Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

Patents

This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.

This document, developed by the Rule Interchange Format (RIF) Working Group, specifies a list of primitive datatypes, built-in functions and built-in predicates supported by the RIF Basic Logic Dialect. Some of the RIF supported datatypes are adopted from XML Schema Part 2: Datatypes Second Edition. A large part of the definitions of the listed functions and operators are adapted from XQuery 1.0 and XPath 2.0 Functions and Operators.

1 Naming and notational conventions used in this document

Throughout this document we use the following prefixes for symbol spaces:

the xsd: prefix stands for the XML Schema namespace URI http://www.w3.org/2001/XMLSchema#,
the rdf: prefix stands for http://www.w3.org/1999/02/22-rdf-syntax-ns#,
the rif: prefix stands for http://www.w3.org/2007/rif#,

Syntax such as xsd:string should be understood as a compact URI [CURIE] -- a macro that expands to a concatenation of the character sequence denoted by the prefix xsd and the string string. The compact URI notation is not part of the RIF syntax, but rather just a space-saving device employed in this document.
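The macro expansion described above can be sketched in a few lines of Python. This is only an illustration of the concatenation rule; the names PREFIXES and expand_curie are hypothetical and are not part of RIF.

```python
# Prefix table copied from the list of prefixes used in this document.
PREFIXES = {
    "xsd": "http://www.w3.org/2001/XMLSchema#",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rif": "http://www.w3.org/2007/rif#",
}


def expand_curie(curie: str) -> str:
    """Expand prefix:local into a full IRI by plain concatenation."""
    prefix, sep, local = curie.partition(":")
    if not sep or prefix not in PREFIXES:
        raise ValueError(f"unknown or missing prefix in {curie!r}")
    return PREFIXES[prefix] + local
```

For example, expand_curie("xsd:string") yields http://www.w3.org/2001/XMLSchema#string.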

2 Symbol Spaces and Data Types

2.1 Symbol Spaces

This section reproduces the definition of symbol spaces, which are part of the syntactic and semantic framework for RIF's logic-based dialects.

Definition (Symbol space). A symbol space is a named subset of the set of all constants, Const, in RIF. Symbol spaces are part of the logical RIF framework (RIF-FLD). Each symbol in Const belongs to exactly one symbol space.

Each symbol space has an associated lexical space, a unique identifier, and, possibly, one or more aliases. More precisely,

The lexical space of a symbol space is a non-empty set of Unicode character strings.
The identifier of a symbol space is a sequence of Unicode characters that form an absolute IRI.
An alias is also a sequence of Unicode characters, but it is not required to form an IRI.
Different symbol spaces cannot share the same identifier or an alias.

The identifiers and aliases of symbol spaces are not themselves constant symbols in RIF. ☐

To simplify the language, we will often use symbol space identifiers to refer to the actual symbol spaces (for instance, we may use "symbol space xsd:string" instead of "symbol space identified by xsd:string").

To refer to a constant in a particular RIF symbol space, we use the following presentation syntax:

     "literal"^^symspace

where literal is called the lexical part of the symbol, and symspace is an identifier or an alias of the symbol space. Here literal is a sequence of Unicode characters that must be an element in the lexical space of the symbol space symspace. For instance, "1.2"^^xsd:decimal and "1"^^xsd:decimal are legal symbols because 1.2 and 1 are members of the lexical space of the XML Schema data type xsd:decimal. On the other hand, "a+2"^^xsd:decimal is not a legal symbol, since a+2 is not part of the lexical space of xsd:decimal.
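As a rough sketch of the legality check just described, the following Python code splits a constant into its lexical part and symbol space and tests membership in the lexical space of xsd:decimal. The regular expressions and function names are assumptions made for illustration: only xsd:decimal is modeled, and escaped quotes inside the literal are not handled.

```python
import re

# Hypothetical pattern for the presentation syntax "literal"^^symspace.
CONSTANT = re.compile(r'"(?P<literal>[^"]*)"\^\^(?P<symspace>\S+)')

# Lexical space of xsd:decimal: optional sign, ASCII digits,
# optional fractional part.
XSD_DECIMAL = re.compile(r'[+-]?([0-9]+(\.[0-9]*)?|\.[0-9]+)')


def parse_constant(text: str):
    """Split a constant into (lexical part, symbol space)."""
    match = CONSTANT.fullmatch(text)
    if match is None:
        raise ValueError(f"not of the form \"literal\"^^symspace: {text!r}")
    return match.group("literal"), match.group("symspace")


def is_legal_decimal_constant(text: str) -> bool:
    """True if text is a legal symbol in the symbol space xsd:decimal."""
    literal, symspace = parse_constant(text)
    return symspace == "xsd:decimal" and XSD_DECIMAL.fullmatch(literal) is not None
```

Under these assumptions, the two legal examples above are accepted and "a+2"^^xsd:decimal is rejected.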

The set of all symbol spaces that partition Const is considered to be part of the logic language of RIF-FLD.

RIF requires that all dialects include the following symbol spaces. Rule sets that are exchanged through RIF can use additional symbol spaces.

xsd:string (http://www.w3.org/2001/XMLSchema#string)

and all the symbol spaces that correspond to the subtypes of xsd:string as specified in [XML-SCHEMA2].

xsd:decimal (http://www.w3.org/2001/XMLSchema#decimal)

and all the symbol spaces that correspond to the subtypes of xsd:decimal as specified in [XML-SCHEMA2].

xsd:time (http://www.w3.org/2001/XMLSchema#time).
xsd:date (http://www.w3.org/2001/XMLSchema#date).
xsd:dateTime (http://www.w3.org/2001/XMLSchema#dateTime).

The lexical spaces of the above symbol spaces are defined in the document [XML-SCHEMA2].

rdf:XMLLiteral (http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral).

This symbol space represents XML content. The lexical space of rdf:XMLLiteral is defined in the document [RDF-CONCEPTS].

rif:text (for text strings with language tags attached).

This symbol space represents text strings with a language tag attached. The lexical space of rif:text is the set of all Unicode strings of the form ...@LANG, i.e., strings that end with @LANG where LANG is a language identifier as defined in [RFC-3066].

rif:iri (for internationalized resource identifiers or IRIs).

Constant symbols that belong to this symbol space are intended to be used in a way similar to RDF resources [RDF-SCHEMA]. The lexical space consists of all absolute IRIs as specified in [RFC-3987]; it is unrelated to the XML primitive type anyURI. A rif:iri constant must be interpreted as a reference to one and the same object regardless of the context in which that constant occurs.

rif:local (for constant symbols that are not visible outside of the RIF document in which they occur).

Symbols in this symbol space are local to the RIF documents in which they occur. This means that occurrences of the same rif:local constant in different documents are viewed as unrelated distinct constants, but occurrences of the same rif:local constant in the same document must refer to the same object. The lexical space of rif:local is the same as the lexical space of xsd:string.

2.2 Data Types

Data types in RIF are symbol spaces which have special semantics. That is, each datatype is characterized by a fixed lexical space, value space and lexical-to-value-mapping.

Definition (Primitive data type). A primitive data type (or just a data type, for short) is a symbol space that has

an associated set, called the value space, and
a mapping from the lexical space of the symbol space to the value space, called lexical-to-value-space mapping. ☐

Semantic structures are always defined with respect to a particular set of data types, denoted by DTS. In a concrete dialect, DTS always includes the data types supported by that dialect. All RIF dialects must support the following primitive data types:

xsd:long
xsd:integer
xsd:decimal
xsd:string
xsd:time
xsd:dateTime
rdf:XMLLiteral
rif:text

Their value spaces and the lexical-to-value-space mappings are defined as follows:

For the XML Schema data types of RIF, namely xsd:long, xsd:integer, xsd:decimal, xsd:string, xsd:time, and xsd:dateTime, the value spaces and the lexical-to-value-space mappings are defined in the XML Schema specification [XML-SCHEMA2].
The value space and lexical-to-value-space mapping for the primitive data type rdf:XMLLiteral is defined in RDF [RDF-CONCEPTS].
The value space of rif:text is the set of all pairs of the form (string, lang), where string is a Unicode character sequence and lang is a lowercase Unicode character sequence which is a natural language identifier as defined by [RFC-3066]. The lexical-to-value-space mapping of rif:text, denoted L_rif:text, maps each symbol string@lang in the lexical space of rif:text to (string, lower-case(lang)), where lower-case(lang) is lang written in all-lowercase letters.

The value space and the lexical-to-value-space mapping for rif:text defined here are compatible with RDF's semantics for strings with named tags [RDF-SEMANTICS].
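The lexical-to-value-space mapping of rif:text can be sketched as follows. The function name rif_text_value is hypothetical, and the sketch assumes that the last "@" in the lexical form separates the string from the language tag (RFC 3066 tags consist of letters, digits, and hyphens only); it does not validate the tag itself.

```python
def rif_text_value(lexical: str):
    """Map a rif:text lexical form string@lang to the pair (string, lang)."""
    # Split on the LAST "@", since the string part may itself contain "@".
    string, sep, lang = lexical.rpartition("@")
    if not sep or not lang:
        raise ValueError(f"not in the lexical space of rif:text: {lexical!r}")
    # The language tag is normalized to lowercase; the string is kept as is.
    return (string, lang.lower())
```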

3 Syntax and Semantics of Built-ins

3.1 Syntax of Built-ins

A RIF built-in function or predicate is a special case of externally defined terms, which are defined in RIF Framework for Logic Dialects and also reproduced in the direct definition of RIF Basic Logic Dialect (RIF-BLD).

Syntactically, built-in predicates and functions in RIF are represented as external terms of the form:

  'External' '(' Expr ')'

where Expr is a positional term as defined in RIF Framework for Logic Dialects (see also RIF Basic Logic Dialect). For RIF's normative syntax, see XML Serialization Syntax for RIF-BLD.

RIF-FLD introduces the notion of an external schema to describe both the well-formed externally defined terms and their semantics. In the special case of a RIF built-in, external schemas have an especially simple form. A built-in named f that takes n arguments has the schema

  (?X₁ ... ?X_n; f(?X₁ ... ?X_n))

Here f(?X₁ ... ?X_n) is the actual term that is used to refer to the built-in (in expressions of the form External(f(?X₁ ... ?X_n))) and ?X₁ ... ?X_n is the list of all variables in that term.

For convenience, a complete definition of external schemas is reproduced in Appendix: Schemas for Externally Defined Terms.

3.2 Semantics of Built-ins

The semantics of external terms in RIF-FLD and RIF-BLD is defined using two mappings: I_external and I_truth ο I_external.

I_external. This mapping takes an external schema, σ, and returns a mapping, I_external(σ).
If σ represents a built-in function, I_external(σ) must be that function.

For each built-in function with external schema σ, the present document specifies the mapping I_external(σ).
I_truth. This mapping takes an element of the domain of interpretation and returns a truth value.
In RIF logical semantics, this mapping is used to assign truth values to formulas. In the special case of RIF built-ins, it is used to assign truth values to RIF built-in predicates. The built-in predicates can have the truth values t or f only.

For a built-in predicate with schema σ, RIF-FLD and RIF-BLD require that the truth-valued mapping I_truth ο I_external(σ) must agree with the specification of the corresponding built-in predicate.

For each RIF built-in predicate with schema σ, the present document specifies I_truth ο I_external(σ).

4 List of Supported Built-in Predicates and Functions

This section defines the syntax and semantics of all built-in predicates and functions in RIF. For each built-in, the following is defined:

The name of the built-in.

The external schema of the built-in.

For a built-in function, how it maps its arguments into a result.
As explained in Section Semantics of Built-ins, this corresponds to the mapping I_external(σ) in the formal semantics of RIF-FLD and RIF-BLD, where σ is the external schema of the built-in.
For a built-in predicate, its truth value when the arguments are substituted with values in the domain.
As explained in Section Semantics of Built-ins, this corresponds to the mapping I_truth ο I_external(σ) in the formal semantics of RIF-FLD and RIF-BLD, where σ is the external schema of the built-in.
The intended domains for the arguments of the built-in.
Typically, built-in functions and predicates are defined over the value spaces of appropriate data types. These are the intended domains of the arguments. When an argument falls outside of its intended domain, it is understood as an error. Since this document defines a model-theoretic semantics for RIF built-ins, which does not support the notion of an error, the definitions leave the values of the built-in predicates and functions unspecified in such cases. This means that for different semantic structures, the value of I_external(σ)(a₁ ... a_n) can be anything if one of the arguments is not in its intended domain. Similarly, I_truth ο I_external(σ)(a₁ ... a_n) can be t in some interpretations and f in others.

This indeterminacy in case of an error implies that applications must not make any assumptions about the values of built-ins in such situations. Implementations are even allowed to abort in such cases. The only safe way to communicate rule sets that contain built-ins among RIF-compliant systems is therefore to use data type guards.

Many built-in functions and predicates described below are adapted from XQuery 1.0 and XPath 2.0 Functions and Operators and, when appropriate, we will refer to the definitions in that specification in order to avoid copying them.

4.1 Guard Predicates for Datatypes

RIF requires guard predicates for all its supported datatypes. The schemas for these predicates have the general form

( ?arg₁; "http://www.w3.org/2007/rif-builtin-predicates#isDATATYPE"^^rif:iri ( ?arg₁ ) )

with I_external( ?arg₁; "http://www.w3.org/2007/rif-builtin-predicates#isDATATYPE"^^rif:iri ( ?arg₁ ) )(s₁) = t if and only if s₁ is in the value space of DATATYPE and f otherwise. Accordingly, the following schemas are defined:

 ( ?arg₁; "http://www.w3.org/2007/rif-builtin-predicates#isLong"^^rif:iri ( ?arg₁ ) )

 ( ?arg₁; "http://www.w3.org/2007/rif-builtin-predicates#isInteger"^^rif:iri( ?arg₁ ) )

 ( ?arg₁; "http://www.w3.org/2007/rif-builtin-predicates#isDecimal"^^rif:iri ( ?arg₁ ) )

 ( ?arg₁; "http://www.w3.org/2007/rif-builtin-predicates#isString"^^rif:iri ( ?arg₁ ) )

 ( ?arg₁; "http://www.w3.org/2007/rif-builtin-predicates#isTime"^^rif:iri ( ?arg₁ ) )

 ( ?arg₁; "http://www.w3.org/2007/rif-builtin-predicates#isDateTime"^^rif:iri ( ?arg₁ ) )

 ( ?arg₁; "http://www.w3.org/2007/rif-builtin-predicates#isXMLLiteral"^^rif:iri ( ?arg₁ ) )

 ( ?arg₁; "http://www.w3.org/2007/rif-builtin-predicates#isText"^^rif:iri ( ?arg₁ ) )

Likewise, RIF has negative guards for all its supported datatypes. These built-in predicates have the schema

( ?arg₁; "http://www.w3.org/2007/rif-builtin-predicates#isNotDATATYPE"^^rif:iri ( ?arg₁ ) )

with I_external( ?arg₁; "http://www.w3.org/2007/rif-builtin-predicates#isNotDATATYPE"^^rif:iri ( ?arg₁ ) )(s₁) = f if and only if s₁ is in the value space of DATATYPE and t otherwise. Accordingly, the following schemas are defined:

 ( ?arg₁; "http://www.w3.org/2007/rif-builtin-predicates#isNotLong"^^rif:iri ( ?arg₁ ) )

 ( ?arg₁; "http://www.w3.org/2007/rif-builtin-predicates#isNotInteger"^^rif:iri ( ?arg₁ ) )

 ( ?arg₁; "http://www.w3.org/2007/rif-builtin-predicates#isNotDecimal"^^rif:iri ( ?arg₁ ) )

 ( ?arg₁; "http://www.w3.org/2007/rif-builtin-predicates#isNotString"^^rif:iri ( ?arg₁ ) )

 ( ?arg₁; "http://www.w3.org/2007/rif-builtin-predicates#isNotTime"^^rif:iri ( ?arg₁ ) )

 ( ?arg₁; "http://www.w3.org/2007/rif-builtin-predicates#isNotDateTime"^^rif:iri ( ?arg₁ ) )

 ( ?arg₁; "http://www.w3.org/2007/rif-builtin-predicates#isNotXMLLiteral"^^rif:iri ( ?arg₁ ) )

 ( ?arg₁; "http://www.w3.org/2007/rif-builtin-predicates#isNotText"^^rif:iri ( ?arg₁ ) )

Future dialects may extend this list of guards to other datatypes, but RIF does not require guards for all datatypes.
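The following Python sketch illustrates how positive and negative guards relate. It is not a normative definition: it assumes that values of xsd:integer are modeled as Python int, values of xsd:decimal as decimal.Decimal or int, and values of xsd:string as str. All function names are hypothetical, and the remaining datatypes are omitted.

```python
from decimal import Decimal

# Bounds of the xsd:long value space.
LONG_MIN, LONG_MAX = -2**63, 2**63 - 1


def is_integer(value) -> bool:
    # bool is a subclass of int in Python, so it is excluded explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def is_long(value) -> bool:
    return is_integer(value) and LONG_MIN <= value <= LONG_MAX


def is_decimal(value) -> bool:
    # xsd:integer is derived from xsd:decimal, so integers qualify as well.
    return is_integer(value) or (isinstance(value, Decimal) and value.is_finite())


def is_string(value) -> bool:
    return isinstance(value, str)


def negate(guard):
    """Build the negative guard isNotDATATYPE from the guard isDATATYPE."""
    return lambda value: not guard(value)


is_not_integer = negate(is_integer)
is_not_string = negate(is_string)
```

Unlike the other built-ins, a guard is defined for every argument, so it can be used to protect a built-in call from arguments outside its intended domain.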

4.2 Cast functions for Datatypes and rif:iri

RIF requires cast functions for all its supported datatypes, i.e.

( ?arg₁; "DATATYPE"^^rif:iri ( ?arg₁ ) )

where DATATYPE is one of the primitive data types defined in Section 2.2 Data Types, and

I_external( ?arg₁; "DATATYPE"^^rif:iri ( ?arg₁ ) )(s₁) = s₁'

such that s₁' is the conversion of s₁ to the value space of DATATYPE if and only if conversion from s₁ is possible according to the table in Section 17.1 of XQuery 1.0 and XPath 2.0 Functions and Operators, and err:XPTY0004 otherwise.

Partial conversion functions between the datatypes xsd:time, xsd:string, xsd:dateTime are defined as in the conversion table in Section 17.1 of XQuery 1.0 and XPath 2.0 Functions and Operators.

Note that the three remaining data types xsd:long, rdf:XMLLiteral, and rif:text, as well as the symbol space rif:iri, do not appear in that table, but the following considerations apply:

The conversions from and to xsd:long follow the same considerations as xsd:integer in that table, by the XPath, XQuery type hierarchy in Section 1.6 of XQuery 1.0 and XPath 2.0 Functions and Operators.
Any rdf:XMLLiteral can be cast to xsd:string by preserving the lexical value and just changing the symbol space. An xsd:string can be cast to rdf:XMLLiteral if and only if its value is in the lexical space of rdf:XMLLiteral as defined in Resource Description Framework (RDF): Concepts and Abstract Syntax.
Any rif:text can be cast to xsd:string by preserving the lexical value of its string part.
Any err: can be cast to xsd:string by preserving its lexical value.
Additionally we allow conversions from and to rif:iri following the same considerations as conversions from and to xsd:anyURI.

Although rif:iri is not a datatype, I left conversions from and to rif:iris in the list of cast functions, since I see some use cases in the context of RDF if you want to extract a string from an IRI and vice versa. See also the use case mentioned in http://lists.w3.org/Archives/Public/public-rif-wg/2008Mar/0011.html

4.3 Numeric Functions and Predicates

4.3.1 Numeric Functions

1. op:numeric-add

Schema:
(?arg₁ ?arg₂; func:numeric-add(?arg₁ ?arg₂))
Intended domains:
The value spaces of xsd:integer, xsd:long, or xsd:decimal for both arguments.
Mapping:
When both s₁ and s₂ belong to their intended domains, External(func:numeric-add(s₁ s₂)) evaluates to the result of op:numeric-add(s₁, s₂) as defined in XQuery 1.0 and XPath 2.0 Functions and Operators.

If an argument value is outside of the intended domain, the value of the function is left unspecified and can vary from one semantic structure to another.
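A minimal sketch of this mapping in Python is shown below. It assumes the numeric value spaces are modeled by Python int and decimal.Decimal; the names in_numeric_domain, UnspecifiedValue, and numeric_add are hypothetical. Raising an exception for arguments outside the intended domain is only one of the behaviors the semantics permits, chosen here for illustration.

```python
from decimal import Decimal


class UnspecifiedValue(Exception):
    """Raised where the semantics leaves the value of a built-in open."""


def in_numeric_domain(value) -> bool:
    # Models the value spaces of xsd:integer, xsd:long, and xsd:decimal.
    if isinstance(value, bool):
        return False
    return isinstance(value, int) or (isinstance(value, Decimal) and value.is_finite())


def numeric_add(s1, s2):
    """Sketch of External(func:numeric-add(s1 s2))."""
    if not (in_numeric_domain(s1) and in_numeric_domain(s2)):
        # Unspecified by the semantics; this sketch chooses to abort.
        raise UnspecifiedValue("argument outside the intended domain")
    # Mixed int/Decimal addition yields a Decimal in Python.
    return s1 + s2
```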

2. op:numeric-subtract

(?arg₁ ?arg₂; func:numeric-subtract( ?arg₁ ?arg₂) )

3. op:numeric-multiply

(?arg₁ ?arg₂; func:numeric-multiply( ?arg₁ ?arg₂) )

4. op:numeric-divide

(?arg₁ ?arg₂; func:numeric-divide( ?arg₁ ?arg₂) )

4.3.2 Numeric Predicates

1. op:numeric-equal

Schema:
(?arg₁ ?arg₂; pred:numeric-equal(?arg₁ ?arg₂))
Intended domains:
The value spaces of xsd:integer, xsd:long, or xsd:decimal for both arguments.
Mapping:
When both s₁ and s₂ belong to their intended domains, External(pred:numeric-equal(s₁ s₂)) is t if and only if op:numeric-equal(s₁, s₂) returns true, as defined in XQuery 1.0 and XPath 2.0 Functions and Operators.

If an argument value is outside of the intended domain, the truth value of the predicate is left unspecified and can vary from one semantic structure to another.

2. op:numeric-less-than

(?arg₁ ?arg₂; pred:numeric-less-than( ?arg₁ ?arg₂) )

3. op:numeric-greater-than

(?arg₁ ?arg₂; pred:numeric-greater-than( ?arg₁ ?arg₂) )

4.4 Functions and Predicates on Strings

Issue: In the following, we often encounter several versions of some built-ins, since XPath and XQuery allow overloading, i.e., the same function or operator name occurring with different arities. We suggest treating this likewise in RIF by numbering the different versions of the respective built-ins and treating the unnumbered version as syntactic sugar. For instance, instead of External( func:concat2( str₁ str₂ ) ) and External( func:concat3( str₁ str₂ str₃ ) ) we equally allow writing simply External( func:concat( str₁ str₂ ) ) and External( func:concat( str₁ str₂ str₃ ) ). Note that this is purely syntactic sugar and does not mean that, for external predicates and functions, we lift the restriction that each function and predicate has a unique assigned arity. Those schemata for which we allow this syntactic sugar appear in the same box.

4.4.1 Functions on Strings

1. fn:compare

( ?comparand₁ ?comparand₂; func:compare1(?comparand₁ ?comparand₂) )
( ?comparand₁ ?comparand₂ ?collation; func:compare2(?comparand₁ ?comparand₂ ?collation) )

where I_external( ?comparand₁ ?comparand₂; func:compare1(?comparand₁ ?comparand₂) )(s₁ s₂) = res

such that res = -1, 0, or 1, depending on whether the value of s₁ is respectively less than, equal to, or greater than the value of s₂, according to the rules of the collation that is used.
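The two-argument version can be sketched in Python as follows. The sketch assumes that the collation in use is plain Unicode code point order, which is how Python compares str values; other collations are not modeled, and the function name compare is hypothetical.

```python
def compare(s1: str, s2: str) -> int:
    """Return -1, 0, or 1 under code point ordering (an assumed collation)."""
    # Each comparison yields a bool; their difference is -1, 0, or 1.
    return (s1 > s2) - (s1 < s2)
```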

2. fn:concat

( ?arg₁; func:concat1(?arg₁ ) )
( ?arg₁ ?arg₂; func:concat2(?arg₁ ?arg₂ ) )
...
( ?arg₁  ?arg₂ ... ?arg_n; func:concatn(?arg₁ ?arg₂ ... ?arg_n ) )

Accepts xs:anyAtomicType arguments and casts them to xsd:string. Returns the xsd:string that is the concatenation of the values of its arguments after conversion. Only defined if all arguments are castable to strings (see Section on Cast functions above), otherwise returns an error as defined for fn:concat.

2. fn:string-join

( ?arg₁ ?arg₂; func:string-join2(?arg₁ ?arg₂ ) )
( ?arg₁ ?arg₂ ?arg₃; func:string-join3(?arg₁ ?arg₂ ?arg₃ ) )
...
( ?arg₁  ?arg₂ ... ?arg_n; func:string-joinn(?arg₁ ?arg₂ ... ?arg_n ) )

Returns an xsd:string created by concatenating the arguments 1 to (n-1) using the n-th argument as a separator. Only defined if all arguments are strings, otherwise returns an error as defined for fn:string-join.
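The argument convention of this built-in, with the separator in last position, can be sketched in Python as follows. The function name string_join is hypothetical, and raising TypeError is an assumption that stands in for the error behavior referenced above.

```python
def string_join(*args: str) -> str:
    """Join arguments 1..(n-1), using the n-th argument as the separator."""
    if len(args) < 2:
        raise TypeError("string_join needs at least one string and a separator")
    if not all(isinstance(arg, str) for arg in args):
        raise TypeError("all arguments must be strings")
    # Unpack: everything but the last argument is joined by the last one.
    *parts, separator = args
    return separator.join(parts)
```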

3. fn:substring

( ?sourceString ?startingLoc; func:substring1( ?sourceString ?startingLoc) )
( ?sourceString ?startingLoc ?length; func:substring2( ?sourceString ?startingLoc ?length) )

Returns the portion of the value of ?sourceString beginning at the position indicated by the value of ?startingLoc and continuing for the number of characters indicated by the value of ?length. The characters returned do not extend beyond ?sourceString. If ?startingLoc is zero or negative, only those characters in positions greater than zero are returned.
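The position arithmetic can be sketched in Python as follows, following the rule of fn:substring in XQuery 1.0 and XPath 2.0 Functions and Operators that the characters returned are those at 1-based positions p with round(start) <= p < round(start) + round(length). The function names are hypothetical; NaN and infinite arguments are not handled.

```python
import math


def xpath_round(x) -> int:
    """Round half up, as fn:round does for finite values."""
    return math.floor(x + 0.5)


def substring(source: str, start, length=None) -> str:
    """Return the characters of source at 1-based positions in [first, end)."""
    first = xpath_round(start)
    if length is None:
        end = len(source) + 1          # run to the end of the string
    else:
        end = first + xpath_round(length)
    # Positions below 1 simply select nothing, so a zero or negative
    # start shortens the result instead of raising an error.
    return "".join(ch for pos, ch in enumerate(source, start=1) if first <= pos < end)
```

For example, substring("12345", 0, 3) selects positions 0, 1, and 2, of which only positions 1 and 2 exist, giving "12".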

4. fn:string-length

( func:string-length1() )
( ?arg ; func:string-length2( ?arg ) )

Returns an xsd:integer equal to the length in characters of the argument if it is an xsd:string.

5. fn:upper-case

( ?arg ; func:upper-case( ?arg ) )

6. fn:lower-case

( ?arg ; func:lower-case( ?arg ) )

7. fn:encode-for-uri

( ?arg ; func:encode-for-uri( ?arg ) )

This function encodes reserved characters in an xs:string that is intended to be used in the path segment of a URI. It is invertible but not idempotent.

8. fn:iri-to-uri

( ?iri ; func:iri-to-uri ( ?iri ) )

9. fn:escape-html-uri

( ?uri ; func:escape-html-uri( ?uri ) )

This function escapes all characters except printable characters of the US-ASCII coded character set, specifically the octets ranging from 32 to 126 (decimal). The effect of the function is to escape a URI in the manner HTML user agents handle attribute values that expect URIs. Each character in ?uri to be escaped is replaced by an escape sequence, which is formed by encoding the character as a sequence of octets in UTF-8, and then representing each of these octets in the form %HH, where HH is the hexadecimal representation of the octet. This function must always generate hexadecimal values using the upper-case letters A-F.
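The escaping rule can be sketched directly in Python. The function name escape_html_uri is hypothetical, and the sketch assumes the input is a well-formed Unicode string.

```python
def escape_html_uri(uri: str) -> str:
    """Escape every character outside the printable US-ASCII range 32..126."""
    pieces = []
    for ch in uri:
        if 32 <= ord(ch) <= 126:
            pieces.append(ch)          # printable US-ASCII is kept as is
        else:
            # Encode as UTF-8 and write each octet as %HH in upper case.
            pieces.extend(f"%{octet:02X}" for octet in ch.encode("utf-8"))
    return "".join(pieces)
```

Note that a space (octet 32) is left untouched, so this function does not by itself produce a valid URI.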

10. fn:substring-before

( ?arg₁ ?arg₂; func:substring-before1( ?arg₁ ?arg₂ ) )
( ?arg₁ ?arg₂ ?collation; func:substring-before2( ?arg₁ ?arg₂ ?collation ) )

11. fn:substring-after

( ?arg₁ ?arg₂; func:substring-after1( ?arg₁ ?arg₂ ) )
( ?arg₁ ?arg₂ ?collation; func:substring-after2( ?arg₁ ?arg₂ ?collation ) )

12. fn:replace

( ?input ?pattern ?replacement; func:replace1( ?input ?pattern ?replacement ) )
( ?input ?pattern ?replacement ?flags; func:replace2( ?input ?pattern ?replacement ?flags ) )

Editor's note: I removed fn:tokenize from previous versions, since tokenizer returns a sequence of strings, we might reconsider this if we (re-)introduce lists, but I left this out for the moment.

4.4.2 Predicates on Strings

The predicates described here examine the first string argument to see whether it contains the second string argument as a substring.

1. fn:contains

( ?arg₁ ?arg₂; pred:contains1( ?arg₁ ?arg₂ ) ) 
( ?arg₁ ?arg₂ ?collation ; pred:contains2( ?arg₁ ?arg₂ ?collation ) )

Returns true or false indicating whether or not the value of ?arg₁ contains (at the beginning, at the end, or anywhere within) at least one sequence of collation units that provides a minimal match to the collation units in the value of ?arg₂, according to the collation that is used. "Minimal match" is defined in Unicode Collation Algorithm.

Shall we return an error or false on typing errors?

2. fn:starts-with

( ?arg₁ ?arg₂; pred:starts-with1( ?arg₁ ?arg₂ ) )
( ?arg₁ ?arg₂ ?collation; pred:starts-with2( ?arg₁ ?arg₂ ?collation) )

Shall we return an error or false on typing errors?

3. fn:ends-with

(?arg₁ ?arg₂; pred:ends-with1( ?arg₁ ?arg₂ ) )
(?arg₁ ?arg₂ ?collation; pred:ends-with2( ?arg₁ ?arg₂ ?collation) )

Shall we return an error or false on typing errors?

4. fn:matches

( ?input ?pattern; pred:matches1( ?input ?pattern) )
( ?input ?pattern ?flags; pred:matches2( ?input ?pattern ?flags ) )

The function returns true if the input matches the regular expression supplied as pattern as influenced by the flags, if present; otherwise, it returns false.

The effect of calling the first version of this function (omitting the flags) is the same as the effect of calling the second version with the flags argument set to a zero-length string.
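The relationship between the two versions can be sketched with Python's re module. This is only an approximation: Python regular expressions are assumed to stand in for the XML Schema based regular expressions used by fn:matches, and the mapping of the flag letters i, m, s, and x onto re flags is likewise an assumption. The function name matches is hypothetical.

```python
import re

# Assumed correspondence between fn:matches flag letters and re flags.
_FLAG_MAP = {
    "i": re.IGNORECASE,
    "m": re.MULTILINE,
    "s": re.DOTALL,
    "x": re.VERBOSE,
}


def matches(input_string: str, pattern: str, flags: str = "") -> bool:
    """True if pattern matches anywhere in input_string."""
    compiled_flags = 0
    for letter in flags:
        if letter not in _FLAG_MAP:
            raise ValueError(f"unsupported flag: {letter!r}")
        compiled_flags |= _FLAG_MAP[letter]
    # The match is unanchored unless the pattern itself uses ^ or $.
    return re.search(pattern, input_string, compiled_flags) is not None
```

Omitting flags is the same as passing the zero-length string, which mirrors the statement above about the first and second versions.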

Shall we return an error or false on typing errors?

4.5 Functions and Predicates on Dates and Times

If not stated otherwise, in the following we define schemas for functions and operators defined on the date and time datatypes in XML Schema Part 2: Datatypes Second Edition.

As defined in Section 3.3.2 Dates and Times, xsd:dateTime, xsd:date, xsd:time, xsd:gYearMonth, xsd:gYear, xsd:gMonthDay, xsd:gMonth, and xsd:gDay values, referred to collectively as date/time values, are represented as seven components or properties: year, month, day, hour, minute, second and timezone. The values of the first five components are xsd:integers. The value of the second component (that is, the seconds) is an xsd:decimal, and the value of the timezone component is an xsd:dayTimeDuration. For all the date/time datatypes, the timezone property is optional and may or may not be present. Depending on the datatype, some of the remaining six properties must be present and some must be absent. Absent, or missing, properties are represented by the empty sequence. This value is referred to as the local value in that the value is in the given timezone. Before comparing or subtracting xsd:dateTime values, this local value must be translated or normalized to UTC.

4.5.1 Predicates on Date and Time Values

1. op:dateTime-equal

( ?arg₁ ?arg₂; pred:dateTime-equal( ?arg₁ ?arg₂) )

where I_external( ?arg₁ ?arg₂; pred:dateTime-equal( ?arg₁ ?arg₂ ) )(s₁ s₂) = t

if and only if op:dateTime-equal(s₁, s₂) returns true, as defined in XQuery 1.0 and XPath 2.0 Functions and Operators, f otherwise. The following schemata are defined analogously with respect to their corresponding operators as defined in XQuery 1.0 and XPath 2.0 Functions and Operators.
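As a sketch of the comparison after normalization to UTC, the following Python code uses datetime.fromisoformat, which accepts a subset of the xsd:dateTime lexical forms. The function name date_time_equal is hypothetical. The implicit timezone that XQuery 1.0 and XPath 2.0 Functions and Operators applies to values without a timezone is not modeled, so mixing values with and without a timezone is rejected here.

```python
from datetime import datetime


def date_time_equal(s1: str, s2: str) -> bool:
    """Compare two xsd:dateTime lexical forms after normalization to UTC."""
    d1 = datetime.fromisoformat(s1)
    d2 = datetime.fromisoformat(s2)
    if (d1.tzinfo is None) != (d2.tzinfo is None):
        raise ValueError("implicit timezone is not modeled in this sketch")
    # Python compares timezone-aware datetimes by their UTC instants.
    return d1 == d2
```

For instance, 12:00 at offset -01:00 and 17:00 at offset +04:00 on the same day both denote 13:00 UTC and are therefore equal.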

2. op:dateTime-less-than

( ?arg₁ ?arg₂; pred:dateTime-less-than(?arg₁ ?arg₂ ) )

3. op:dateTime-greater-than

( ?arg₁ ?arg₂; pred:dateTime-greater-than(?arg₁ ?arg₂ ) )

4. op:date-equal

( ?arg₁ ?arg₂; pred:date-equal( ?arg₁ ?arg₂) )

5. op:date-less-than

( ?arg₁ ?arg₂; pred:date-less-than(?arg₁ ?arg₂ ) )

6. op:date-greater-than

( ?arg₁ ?arg₂; pred:date-greater-than(?arg₁ ?arg₂ ) )

7. op:time-equal

( ?arg₁ ?arg₂; pred:time-equal( ?arg₁ ?arg₂) )

8. op:time-less-than

( ?arg₁ ?arg₂; pred:time-less-than(?arg₁ ?arg₂ ) )

9. op:time-greater-than

( ?arg₁ ?arg₂; pred:time-greater-than(?arg₁ ?arg₂ ) )

4.5.2 Functions on Dates and Times

1. fn:year-from-dateTime

( ?arg ; func:year-from-dateTime( ?arg ) )

where I_external( ?arg ; func:year-from-dateTime( ?arg ) )(s) = res

such that res is the result of fn:year-from-dateTime(s) as defined in XQuery 1.0 and XPath 2.0 Functions and Operators if s is in the value space of xsd:dateTime, otherwise returns an error as defined in XQuery 1.0 and XPath 2.0 Functions and Operators. The following schemata are defined analogously with respect to their corresponding functions and possible error codes as defined in XQuery 1.0 and XPath 2.0 Functions and Operators.
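The component extraction functions can be sketched in Python as follows. The function names are hypothetical and the sketch relies on datetime.fromisoformat, which accepts only a subset of the xsd:dateTime lexical space. Components are taken from the local value, without normalization to UTC, and the seconds component is returned as a Decimal to reflect that it is an xsd:decimal.

```python
from datetime import datetime
from decimal import Decimal


def year_from_date_time(s: str) -> int:
    return datetime.fromisoformat(s).year


def hours_from_date_time(s: str) -> int:
    # The local hour is returned; the timezone offset is ignored.
    return datetime.fromisoformat(s).hour


def seconds_from_date_time(s: str) -> Decimal:
    d = datetime.fromisoformat(s)
    # Whole seconds plus the fractional part stored as microseconds.
    return Decimal(d.second) + Decimal(d.microsecond) / Decimal(1000000)
```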

Editor's remark: Where in XQuery 1.0 and XPath 2.0 Functions and Operators are the errors for these functions defined?

We slightly deviate here from the original definition of fn:year-from-dateTime which says: "If ?arg is the empty sequence, returns the empty sequence." We have no terminology of "sequence".

Editor's remark: I am not sure whether this is a problem. Do we have to deal with empty sequences?

2. fn:month-from-dateTime

( ?arg ; func:month-from-dateTime( ?arg ) )

3. fn:day-from-dateTime

( ?arg ; func:day-from-dateTime( ?arg ) )

4. fn:hours-from-dateTime

( ?arg ; func:hours-from-dateTime( ?arg ) )

5. fn:minutes-from-dateTime

( ?arg ; func:minutes-from-dateTime( ?arg  ) )

6. fn:seconds-from-dateTime

( ?arg ; func:seconds-from-dateTime( ?arg ) )

7. fn:year-from-date

( ?arg ; func:year-from-date( ?arg ) )

8. fn:month-from-date

( ?arg ; func:month-from-date( ?arg ) )

9. fn:day-from-date

( ?arg ; func:day-from-date( ?arg ) )

10. fn:hours-from-time

( ?arg ; func:hours-from-time( ?arg ) )

11. fn:minutes-from-time

( ?arg ; func:minutes-from-time( ?arg ) )

12. fn:seconds-from-time

( ?arg ; func:seconds-from-time( ?arg ) )

4.6 Functions and Predicates on rdf:XMLLiterals

We support (though do not impose) an XPath built-in function, applied to XMLLiterals as follows.

(?xml ?xquery; func:evalXQuery( ?xml ?xquery) )

where I_external(?xml ?xquery; func:evalXQuery( ?xml ?xquery) )(s₁ s₂) = res such that, if s₁ is in the value space of rdf:XMLLiteral and s₂ is in the value space of xsd:string and s₂ is a syntactically valid XQuery expression, then res is the result of the XQuery s₂ on the context item being the root node of the XML document represented by s₁, and otherwise if no other error as defined in XQuery applies, the error err:XQTY0028.

Editor's remark: The error code err:XQTY0028 is currently unused.

4.7 Functions and Predicates on rif:text

(?arg ; func:lang( ?arg ) )

where I_external(?arg ; func:lang( ?arg ) )(s) = res such that res is the language tag string of s, if s is in the value space of rif:text and ""^^xsd:string otherwise.

5 References

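[RDF-CONCEPTS]: Resource Description Framework (RDF): Concepts and Abstract Syntax, Graham Klyne and Jeremy J. Carroll, Editors, W3C Recommendation, 10 February 2004, http://www.w3.org/TR/2004/REC-rdf-concepts-20040210/. Latest version available at http://www.w3.org/TR/rdf-concepts/.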

[RDF-SEMANTICS]: RDF Semantics, Patrick Hayes, Editor, W3C Recommendation, 10 February 2004, http://www.w3.org/TR/2004/REC-rdf-mt-20040210/. Latest version available at http://www.w3.org/TR/rdf-mt/.

[RDF-SCHEMA]: RDF Vocabulary Description Language 1.0: RDF Schema, Brian McBride, Editor, W3C Recommendation 10 February 2004, http://www.w3.org/TR/rdf-schema/.

[RFC-3066]: RFC 3066 - Tags for the Identification of Languages, H. Alvestrand, IETF, January 2001. This document is at http://www.isi.edu/in-notes/rfc3066.txt.

[RFC-3987]: RFC 3987 - Internationalized Resource Identifiers (IRIs), M. Duerst and M. Suignard, IETF, January 2005. This document is at http://www.ietf.org/rfc/rfc3987.txt.

[XML-SCHEMA2]: XML Schema Part 2: Datatypes, W3C Recommendation, World Wide Web Consortium, 2 May 2001. This version is http://www.w3.org/TR/2001/REC-xmlschema-2-20010502/. The latest version is available at http://www.w3.org/TR/xmlschema-2/.

6 Appendix: Schemas for Externally Defined Terms

This section is an edited copy of a section from RIF Framework for Logic Dialects. It is reproduced here for the convenience of readers of the RIF-BLD document who might not be familiar with RIF-FLD.

This section defines external schemas, which serve as templates for externally defined terms. These schemas determine which externally defined terms are acceptable in a RIF dialect. Externally defined terms include RIF built-ins, but are more general. They are designed to also accommodate the ideas of procedural attachments and querying of external data sources. Because of the need to accommodate many different possibilities, the RIF logical framework supports a very general notion of an externally defined term. Such a term is not necessarily a function or a predicate -- it can be a frame, a classification term, and so on.

Definition (Schema for external term). An external schema is a statement of the form (?X₁ ... ?X_n; τ) where

τ is a term with the exception that it is not permitted to be a variable or an externally defined term.
?X₁ ... ?X_n is a list of all distinct variables that occur in τ.

The names of the variables in an external schema are immaterial, but their order is important. For instance, (?X ?Y; ?X[foo->?Y]) and (?V ?W; ?V[foo->?W]) are considered to be the same schema, but (?X ?Y; ?X[foo->?Y]) and (?Y ?X; ?X[foo->?Y]) are viewed as different schemas.
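One way to make this comparison concrete is to rename the variables of each schema by their position in the variable list and then compare the results. The Python sketch below assumes a hypothetical encoding that is not part of RIF: variables are Var objects, constants are strings, and a frame ?X[foo->?Y] is the tuple ("frame", X, "foo", Y).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    """Hypothetical representation of a RIF variable such as ?X."""
    name: str


def substitute(term, mapping):
    """Apply a simultaneous variable substitution to a term."""
    if isinstance(term, Var):
        return mapping.get(term, term)
    if isinstance(term, tuple):
        return tuple(substitute(arg, mapping) for arg in term)
    return term  # constants are left unchanged


def canonical(schema):
    """Rename each variable to a name derived from its list position."""
    variables, tau = schema
    mapping = {v: Var(f"_{i}") for i, v in enumerate(variables, start=1)}
    return (tuple(mapping[v] for v in variables), substitute(tau, mapping))


def same_schema(a, b):
    """Two schemas are the same iff their canonical forms coincide."""
    return canonical(a) == canonical(b)
```

With this encoding, the first pair of schemas in the example above compares as equal, and the pair that differs only in the order of the variable list compares as different.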

A term t is an instance of an external schema (?X₁ ... ?X_n; τ) iff t can be obtained from τ by a simultaneous substitution ?X₁/s₁ ... ?X_n/s_n of the variables ?X₁ ... ?X_n with terms s₁ ... s_n, respectively. Some of the terms s_i can be variables themselves. For example, ?Z[foo->f(a ?P)] is an instance of (?X ?Y; ?X[foo->?Y]) by the substitution ?X/?Z ?Y/f(a ?P). ☐
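The instance test in this definition amounts to one-way matching of τ against t. The following Python sketch illustrates it under the same kind of hypothetical encoding (Var objects for variables, strings for constants, tuples for compound terms and frames); the function names are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    """Hypothetical representation of a RIF variable such as ?X."""
    name: str


def match(pattern, term, subst):
    """Extend subst so that pattern under subst equals term, or return None."""
    if isinstance(pattern, Var):
        if pattern in subst:
            # A repeated schema variable must map to the same term each time.
            return subst if subst[pattern] == term else None
        return {**subst, pattern: term}
    if isinstance(pattern, tuple):
        if not isinstance(term, tuple) or len(term) != len(pattern):
            return None
        for p, t in zip(pattern, term):
            subst = match(p, t, subst)
            if subst is None:
                return None
        return subst
    return subst if pattern == term else None  # constants must be identical


def instance_of(term, schema):
    """Return the terms s_1 ... s_n substituted for ?X_1 ... ?X_n, or None."""
    variables, tau = schema
    subst = match(tau, term, {})
    if subst is None:
        return None
    return [subst[v] for v in variables]
```

For the example in the definition, matching ?Z[foo->f(a ?P)] against (?X ?Y; ?X[foo->?Y]) returns the list [?Z, f(a ?P)] in this encoding, whereas a bare variable matches no schema whose τ is a compound term.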

Observe that a variable cannot be an instance of an external schema, since τ in the above definition cannot be a variable. It will be seen later that this implies that a term of the form External(?X) is not well-formed in RIF.

The intuition behind the notion of an external schema, such as (?X ?Y; ?X["foo"^^xsd:string->?Y]) or (?V; "pred:isTime"^^rif:iri(?V)), is that ?X["foo"^^xsd:string->?Y] or "pred:isTime"^^rif:iri(?V) are invocation patterns for querying external sources, and instances of those schemas correspond to concrete invocations. Thus, External("http://foo.bar.com"^^rif:iri["foo"^^xsd:string->"123"^^xsd:integer]) and External("pred:isTime"^^rif:iri("22:33:44"^^xsd:time)) are examples of invocations of external terms -- one querying an external source and another invoking a built-in.

Definition (Coherent set of external schemas). A set of external schemas is coherent if there can be no term, t, that is an instance of two distinct schemas. ☐

The intuition behind this notion is to ensure that any use of an external term is associated with at most one external schema. This assumption is relied upon in the definition of the semantics of externally defined terms. Note that the coherence condition is easy to verify syntactically and that it implies that schemas like (?X ?Y; ?X[foo->?Y]) and (?Y ?X; ?X[foo->?Y]), which differ only in the order of their variables, cannot be in the same coherent set.
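As an illustration of such a syntactic check, two schemas have a common instance exactly when their terms τ unify after the variables of each are renamed apart. The Python sketch below rests on that observation and on a hypothetical term encoding (Var objects, string constants, tuples for compound terms and frames). It also assumes that the schemas passed in are pairwise distinct; it is not a procedure prescribed by RIF.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Var:
    """Hypothetical representation of a RIF variable such as ?X."""
    name: str


def rename(term, tag):
    """Rename variables apart by tagging them with a schema index."""
    if isinstance(term, Var):
        return Var(f"{term.name}#{tag}")
    if isinstance(term, tuple):
        return tuple(rename(arg, tag) for arg in term)
    return term


def walk(term, subst):
    """Follow variable bindings until an unbound variable or non-variable."""
    while isinstance(term, Var) and term in subst:
        term = subst[term]
    return term


def occurs(var, term, subst):
    """Check whether var occurs in term under the current bindings."""
    term = walk(term, subst)
    if term == var:
        return True
    if isinstance(term, tuple):
        return any(occurs(var, arg, subst) for arg in term)
    return False


def unify(a, b, subst):
    """Return a unifying substitution extending subst, or None."""
    a, b = walk(a, subst), walk(b, subst)
    if a == b:
        return subst
    if isinstance(a, Var):
        return None if occurs(a, b, subst) else {**subst, a: b}
    if isinstance(b, Var):
        return None if occurs(b, a, subst) else {**subst, b: a}
    if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
        for x, y in zip(a, b):
            subst = unify(x, y, subst)
            if subst is None:
                return None
        return subst
    return None


def is_coherent(schemas):
    """True if no two of the given (distinct) schemas share an instance."""
    for (i, (_, t1)), (j, (_, t2)) in combinations(enumerate(schemas), 2):
        if unify(rename(t1, i), rename(t2, j), {}) is not None:
            return False
    return True
```

In this sketch the two schemas that differ only in the order of their variables are reported as incoherent, because their terms τ are identical and therefore unify.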

It is important to understand that external schemas are not part of the language in RIF, since they do not appear anywhere in RIF statements. Instead, they are best thought of as part of the grammar of the language.
