W3C


RIF Data Types and Built-Ins

W3C Editor's Draft 18 May 2008

This version:
http://www.w3.org/2005/rules/wg/draft/ED-rif-dtb-20080518/
Latest editor's draft:
http://www.w3.org/2005/rules/wg/draft/rif-dtb/
Previous version:
http://www.w3.org/2005/rules/wg/draft/ED-rif-dtb-20080222/ (color-coded diff)
Editors:
Axel Polleres, DERI
Harold Boley, National Research Council Canada
Michael Kifer, State University of New York at Stony Brook


Abstract

This document, developed by the Rule Interchange Format (RIF) Working Group, specifies a list of primitive datatypes, built-in functions and built-in predicates supported by the RIF Basic Logic Dialect (RIF-BLD, http://www.w3.org/TR/rif-bld/). Some of the RIF supported datatypes are adopted from XML Schema Part 2: Datatypes Second Edition. A large part of the definitions of the listed functions and operators are adapted from XQuery 1.0 and XPath 2.0 Functions and Operators (W3C Recommendation 23 January 2007).

Status of this Document

May Be Superseded

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.

This document is being published as one of a set of 5 documents:

  1. RIF Basic Logic Dialect
  2. RIF Framework for Logic Dialects
  3. RIF RDF and OWL Compatibility
  4. RIF Data Types and Built-Ins (this document)
  5. RIF Use Cases and Requirements

Please Comment By 2008-05-25

The Rule Interchange Format (RIF) Working Group seeks public feedback on these Working Drafts. Please send your comments to public-rif-comments@w3.org (public archive). If possible, please offer specific changes to the text that would address your concern. You may also wish to check the Wiki Version of this document for internal-review comments and changes being drafted which may address your concerns.

No Endorsement

Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

Patents

This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.



Contents

1 Naming and notational conventions used in this document

Throughout this document we use the following prefixes for symbol spaces:

Syntax such as xsd:string should be understood as a compact URI [CURIE], that is, a macro that expands to a concatenation of the character sequence denoted by the prefix xsd and the string string. The compact URI notation is not part of the RIF syntax, but rather just a space-saving device employed in this document.
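The macro expansion described above can be sketched in Python. This is only an illustration: the names PREFIXES and expand_curie are hypothetical, and the table holds just the xsd prefix, bound to the XML Schema namespace.

```python
# Prefix table (illustrative); xsd is bound to the XML Schema namespace.
PREFIXES = {"xsd": "http://www.w3.org/2001/XMLSchema#"}

def expand_curie(curie, prefixes=PREFIXES):
    """Expand 'prefix:local' by concatenating the prefix's IRI and the local part."""
    prefix, sep, local = curie.partition(":")
    if not sep or prefix not in prefixes:
        raise ValueError(f"unknown or missing prefix in {curie!r}")
    return prefixes[prefix] + local
```

Under these assumptions, expand_curie("xsd:string") yields the full IRI of the xsd:string symbol space.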


2 Symbol Spaces and Data Types

2.1 Symbol Spaces

This section reproduces the definition of symbol spaces, which are part of the syntactic and semantic framework for RIF's logic-based dialects.

Definition (Symbol space). A symbol space is a named subset of the set of all constants, Const in RIF. Symbol spaces are part of the logical RIF framework (RIF-FLD). Each symbol in Const belongs to exactly one symbol space.

Each symbol space has an associated lexical space, a unique identifier, and, possibly, one or more aliases. More precisely,

The identifiers and aliases of symbol spaces are not themselves constant symbols in RIF.   ☐

To simplify the language, we will often use symbol space identifiers to refer to the actual symbol spaces (for instance, we may use "symbol space xsd:string" instead of "symbol space identified by xsd:string").

To refer to a constant in a particular RIF symbol space, we use the following presentation syntax:

     "literal"^^symspace

where literal is called the lexical part of the symbol, and symspace is an identifier or an alias of the symbol space. Here literal is a sequence of Unicode characters that must be an element in the lexical space of the symbol space symspace. For instance, "1.2"^^xsd:decimal and "1"^^xsd:decimal are legal symbols because 1.2 and 1 are members of the lexical space of the XML Schema data type xsd:decimal. On the other hand, "a+2"^^xsd:decimal is not a legal symbol, since a+2 is not part of the lexical space of xsd:decimal.
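As a rough illustration of the legality check for constants, the following Python sketch splits a constant into its lexical part and symbol space and tests the lexical part against the xsd:decimal lexical space. The function names and the simplified constant pattern are assumptions made for this example; only xsd:decimal is covered.

```python
import re

# Lexical space of xsd:decimal: optional sign, digits, optional fraction.
DECIMAL_LEXICAL = re.compile(r"[+-]?([0-9]+(\.[0-9]*)?|\.[0-9]+)")
# Simplified pattern for the presentation syntax "literal"^^symspace.
CONSTANT = re.compile(r'"(?P<literal>.*)"\^\^(?P<symspace>\S+)', re.DOTALL)

def parse_constant(text):
    """Return (literal, symspace) for a constant in presentation syntax."""
    match = CONSTANT.fullmatch(text)
    if match is None:
        raise ValueError(f"not of the form \"literal\"^^symspace: {text!r}")
    return match["literal"], match["symspace"]

def is_legal_decimal_constant(text):
    """True if text is an xsd:decimal constant whose literal is in the lexical space."""
    literal, symspace = parse_constant(text)
    return symspace == "xsd:decimal" and DECIMAL_LEXICAL.fullmatch(literal) is not None
```

With this sketch, the two legal examples from the text are accepted and "a+2"^^xsd:decimal is rejected.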

The set of all symbol spaces that partition Const is considered to be part of the logic language of RIF-FLD.


RIF requires that all dialects include the following symbol spaces. Rule sets that are exchanged through RIF can use additional symbol spaces.

xsd:string and all the symbol spaces that correspond to the subtypes of xsd:string as specified in [XML-SCHEMA2].

xsd:decimal and all the symbol spaces that correspond to the subtypes of xsd:decimal as specified in [XML-SCHEMA2].


The lexical spaces of the above symbol spaces are defined in the document [XML-SCHEMA2].

rdf:XMLLiteral. This symbol space represents XML content. The lexical space of rdf:XMLLiteral is defined in the document [RDF-CONCEPTS].

rif:text. This symbol space represents text strings with a language tag attached. The lexical space of rif:text is the set of all Unicode strings of the form ...@LANG, i.e., strings that end with @LANG where LANG is a language identifier as defined in [RFC-3066].

rif:iri. Constant symbols that belong to this symbol space are intended to be used in a way similar to RDF resources [RDF-SCHEMA]. The lexical space consists of all absolute IRIs as specified in [RFC-3987]; it is unrelated to the XML primitive type anyURI. A rif:iri constant must be interpreted as a reference to one and the same object regardless of the context in which that constant occurs.

rif:local. Symbols in this symbol space are local to the RIF documents in which they occur. This means that occurrences of the same rif:local constant in different documents are viewed as unrelated distinct constants, but occurrences of the same rif:local constant in the same document must refer to the same object. The lexical space of rif:local is the same as the lexical space of xsd:string.
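To illustrate the ...@LANG form of the rif:text lexical space, here is a small Python sketch that separates the text from its language identifier at the last @ sign. The function name is hypothetical, and the sketch does not validate LANG against [RFC-3066].

```python
def split_rif_text(lexical):
    """Split a rif:text lexical form 'text@LANG' at the last '@'."""
    text, at, lang = lexical.rpartition("@")
    if not at or not lang:
        raise ValueError(f"not of the form ...@LANG: {lexical!r}")
    return text, lang
```

Splitting at the last @ keeps any earlier @ characters as part of the text itself.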

2.2 Data Types

Data types in RIF are symbol spaces which have special semantics. That is, each datatype is characterized by a fixed lexical space, value space and lexical-to-value-mapping.

Definition (Primitive data type). A primitive data type (or just a data type, for short) is a symbol space that has

Semantic structures are always defined with respect to a particular set of data types, denoted by DTS. In a concrete dialect, DTS always includes the data types supported by that dialect. All RIF dialects must support the following primitive data types:

Their value spaces and the lexical-to-value-space mappings are defined as follows:

The value space and the lexical-to-value-space mapping for rif:text defined here are compatible with RDF's semantics for strings with named tags [RDF-SEMANTICS].

3 Syntax and Semantics of Built-ins

3.1 Syntax of Built-ins

A RIF built-in function or predicate is a special case of externally defined terms, which are defined in RIF Framework for Logic Dialects and also reproduced in the direct definition of RIF Basic Logic Dialect (RIF-BLD).

Syntactically, built-in predicates and functions in RIF are represented as external terms of the form:

  'External' '(' Expr ')'

where Expr is a positional term as defined in RIF Framework for Logic Dialects (see also RIF Basic Logic Dialect). For RIF's normative syntax, see XML Serialization Syntax for RIF-BLD.

RIF-FLD introduces the notion of an external schema to describe both the well-formed externally defined terms and their semantics. In the special case of a RIF built-in, external schemas have an especially simple form. A built-in named f that takes n arguments has the schema

  (?X1 ... ?Xn; f(?X1 ... ?Xn))

Here f(?X1 ... ?Xn) is the actual term that is used to refer to the built-in (in expressions of the form External(f(?X1 ... ?Xn))) and ?X1 ... ?Xn is the list of all variables in that term.
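The shape of a built-in's schema and of the corresponding external term can be sketched as simple string builders in Python. The helper names builtin_schema and external_term are hypothetical and serve only to make the pattern concrete.

```python
def builtin_schema(name, arity):
    """Build the schema (?X1 ... ?Xn; f(?X1 ... ?Xn)) for a built-in f of arity n."""
    variables = " ".join(f"?X{i}" for i in range(1, arity + 1))
    return f"({variables}; {name}({variables}))"

def external_term(name, args):
    """Build the external term External(f(arg1 ... argn)) that refers to the built-in."""
    return f"External({name}({' '.join(args)}))"
```

For a two-argument built-in f, the first helper produces the schema and the second the term a rule would actually contain.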

For convenience, a complete definition of external schemas is reproduced in Appendix: Schemas for Externally Defined Terms.


3.2 Semantics of Built-ins

The semantics of external terms in RIF-FLD and RIF-BLD is defined using two mappings: Iexternal and Itruth ο Iexternal.

4 List of Supported Built-in Predicates and Functions

This section defines the syntax and semantics of all built-in predicates and functions in RIF. For each built-in, the following is defined:

  1. The name of the built-in.
  2. The external schema of the built-in.
  3. For a built-in function, how it maps its arguments into a result.

    As explained in Section Semantics of Built-ins, this corresponds to the mapping Iexternal(σ) in the formal semantics of RIF-FLD and RIF-BLD, where σ is the external schema of the built-in.

  4. For a built-in predicate, its truth value when the arguments are substituted with values in the domain.

    As explained in Section Semantics of Built-ins, this corresponds to the mapping Itruth ο Iexternal(σ) in the formal semantics of RIF-FLD and RIF-BLD, where σ is the external schema of the built-in.

  5. The intended domains for the arguments of the built-in.

    Typically, built-in functions and predicates are defined over the value spaces of appropriate data types. These are the intended domains of the arguments. When an argument falls outside of its intended domain, it is understood as an error. Since this document defines a model-theoretic semantics for RIF built-ins, which does not support the notion of an error, the definitions leave the values of the built-in predicates and functions unspecified in such cases. This means that for different semantic structures, the value of Iexternal(σ)(a1 ... an) can be anything if one of the arguments is not in its intended domain. Similarly, Itruth ο Iexternal(σ)(a1 ... an) can be t in some interpretations and f in others.

    This indeterminacy in case of an error implies that applications must not make any assumptions about the values of built-ins in such situations. Implementations are even allowed to abort in such cases and the only safe way to communicate rule sets that contain built-ins among RIF-compliant systems is to use data type guards.


Many built-in functions and predicates described below are adapted from XQuery 1.0 and XPath 2.0 Functions and Operators and, when appropriate, we will refer to the definitions in that specification in order to avoid copying them.

4.1 Guard Predicates for Datatypes

RIF requires guard predicates for all its supported datatypes. The schemas for these predicates have the general form

( ?arg1; "http://www.w3.org/2007/rif-builtin-predicates#isDATATYPE"^^rif:iri ( ?arg1 ) )

with Iexternal( ?arg1; "http://www.w3.org/2007/rif-builtin-predicates#isDATATYPE"^^rif:iri ( ?arg1 ) )(s1) = t if and only if s1 is in the value space of DATATYPE and f otherwise. Accordingly, the following schemas are defined:

 ( ?arg1; "http://www.w3.org/2007/rif-builtin-predicates#isLong"^^rif:iri ( ?arg1 ) )
 ( ?arg1; "http://www.w3.org/2007/rif-builtin-predicates#isInteger"^^rif:iri ( ?arg1 ) )
 ( ?arg1; "http://www.w3.org/2007/rif-builtin-predicates#isDecimal"^^rif:iri ( ?arg1 ) )
 ( ?arg1; "http://www.w3.org/2007/rif-builtin-predicates#isString"^^rif:iri ( ?arg1 ) )
 ( ?arg1; "http://www.w3.org/2007/rif-builtin-predicates#isTime"^^rif:iri ( ?arg1 ) )
 ( ?arg1; "http://www.w3.org/2007/rif-builtin-predicates#isDateTime"^^rif:iri ( ?arg1 ) )
 ( ?arg1; "http://www.w3.org/2007/rif-builtin-predicates#isXMLLiteral"^^rif:iri ( ?arg1 ) )
 ( ?arg1; "http://www.w3.org/2007/rif-builtin-predicates#isText"^^rif:iri ( ?arg1 ) )
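A guard is a total membership test on a value space: it yields t or f for every argument and is never unspecified. The following Python sketch illustrates this for four of the datatypes. Modelling xsd:integer by Python int, xsd:decimal by decimal.Decimal and xsd:string by str is an assumption of this example, as are the function names.

```python
from decimal import Decimal

def is_integer(value):
    # bool is a subclass of int in Python, so it is excluded explicitly.
    return isinstance(value, int) and not isinstance(value, bool)

def is_long(value):
    # xsd:long is xsd:integer restricted to the signed 64-bit range.
    return is_integer(value) and -2**63 <= value <= 2**63 - 1

def is_decimal(value):
    # xsd:integer is derived from xsd:decimal, so integers qualify as well.
    return is_integer(value) or (isinstance(value, Decimal) and value.is_finite())

def is_string(value):
    return isinstance(value, str)
```

A rule that applies a numeric built-in only after is_decimal has succeeded on each argument never depends on an unspecified value.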

Likewise, RIF has negative guards for all its supported datatypes. These built-in predicates have the schema

( ?arg1; "http://www.w3.org/2007/rif-builtin-predicates#isNotDATATYPE"^^rif:iri ( ?arg1 ) )

with Iexternal( ?arg1; "http://www.w3.org/2007/rif-builtin-predicates#isNotDATATYPE"^^rif:iri ( ?arg1 ) )(s1) = f if and only if s1 is in the value space of DATATYPE and t otherwise. Accordingly, the following schemas are defined:


 ( ?arg1; "http://www.w3.org/2007/rif-builtin-predicates#isNotLong"^^rif:iri ( ?arg1 ) ) 
 ( ?arg1; "http://www.w3.org/2007/rif-builtin-predicates#isNotInteger"^^rif:iri ( ?arg1 ) ) 
 ( ?arg1; "http://www.w3.org/2007/rif-builtin-predicates#isNotDecimal"^^rif:iri ( ?arg1 ) ) 
 ( ?arg1; "http://www.w3.org/2007/rif-builtin-predicates#isNotString"^^rif:iri ( ?arg1 ) ) 
 ( ?arg1; "http://www.w3.org/2007/rif-builtin-predicates#isNotTime"^^rif:iri ( ?arg1 ) ) 
 ( ?arg1; "http://www.w3.org/2007/rif-builtin-predicates#isNotDateTime"^^rif:iri ( ?arg1 ) ) 
 ( ?arg1; "http://www.w3.org/2007/rif-builtin-predicates#isNotXMLLiteral"^^rif:iri ( ?arg1 ) ) 
 ( ?arg1; "http://www.w3.org/2007/rif-builtin-predicates#isNotText"^^rif:iri ( ?arg1 ) )

Future dialects may extend this list of guards to other datatypes, but RIF does not require guards for all datatypes.

4.2 Cast functions for Datatypes and rif:iri

RIF requires cast functions for all its supported datatypes, i.e.

( ?arg1; "DATATYPE"^^rif:iri ( ?arg1 ) )

where DATATYPE is one of the primitive data types defined in Section Data Types, with

Iexternal( ?arg1; "DATATYPE"^^rif:iri ( ?arg1 ) )(s1) = s1'

such that s1' is the conversion of s1 to the value space of xsd:DATATYPE if and only if conversion from s1 is possible according to the table in Section 17.1 of XQuery 1.0 and XPath 2.0 Functions and Operators, and err:XPTY0004 otherwise.

Partial conversion functions between the datatypes xsd:time, xsd:string, xsd:dateTime are defined as in the conversion table in Section 17.1 of XQuery 1.0 and XPath 2.0 Functions and Operators.
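A cast is a partial function: it succeeds only for inputs that can be converted. The Python sketch below illustrates this for a cast to xsd:integer. CastError is a hypothetical stand-in for the error case, the representation of values by Python int, Decimal and str is an assumption, and only three source types are handled.

```python
import re
from decimal import Decimal

class CastError(Exception):
    """Hypothetical stand-in for the error raised when a cast is not possible."""

def cast_to_integer(value):
    """Cast an integer, decimal or string value to an integer, or raise CastError."""
    if isinstance(value, bool):
        raise CastError("booleans are outside the scope of this sketch")
    if isinstance(value, int):
        return value
    if isinstance(value, Decimal) and value.is_finite():
        return int(value)  # the fractional part is discarded
    if isinstance(value, str) and re.fullmatch(r"[+-]?[0-9]+", value.strip()):
        return int(value.strip())
    raise CastError(f"cannot cast {value!r} to an integer")
```

A string such as "abc" is outside the lexical space of xsd:integer, so the sketch raises CastError rather than returning a value.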

Note that the three remaining data types xsd:long, rdf:XMLLiteral, and rif:text, as well as the symbol space rif:iri, do not appear in that table, but the following considerations apply:

Although rif:iri is not a datatype, I left conversions from and to rif:iris in the list of cast functions, since I see some use cases in the context of RDF if you want to extract a string from an IRI and vice versa. See also the use case mentioned in http://lists.w3.org/Archives/Public/public-rif-wg/2008Mar/0011.html

4.3 Numeric Functions and Predicates

4.3.1 Numeric Functions

1. op:numeric-add

(?arg1 ?arg2; func:numeric-add( ?arg1 ?arg2) )

2. op:numeric-subtract

(?arg1 ?arg2; func:numeric-subtract( ?arg1 ?arg2) )

3. op:numeric-multiply

(?arg1 ?arg2; func:numeric-multiply( ?arg1 ?arg2) )

4. op:numeric-divide

(?arg1 ?arg2; func:numeric-divide( ?arg1 ?arg2) )
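The four numeric functions can be sketched in Python as follows, assuming that integer values are modelled by int and decimal values by decimal.Decimal. The function names and the domain check are illustrative. Arguments outside the intended domain raise an exception here, which is one of the behaviours an implementation may choose.

```python
from decimal import Decimal

def _check_numeric(*args):
    # Intended domain of this sketch: integers and decimals (bool excluded).
    for arg in args:
        if isinstance(arg, bool) or not isinstance(arg, (int, Decimal)):
            raise TypeError(f"{arg!r} is outside the intended domain")

def numeric_add(a, b):
    _check_numeric(a, b)
    return a + b

def numeric_subtract(a, b):
    _check_numeric(a, b)
    return a - b

def numeric_multiply(a, b):
    _check_numeric(a, b)
    return a * b

def numeric_divide(a, b):
    _check_numeric(a, b)
    if b == 0:
        raise ZeroDivisionError("division by zero")
    # Dividing two integers yields a decimal, as in op:numeric-divide.
    return Decimal(a) / Decimal(b)
```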

4.3.2 Numeric Predicates

1. op:numeric-equal

(?arg1 ?arg2; pred:numeric-equal( ?arg1 ?arg2) )

2. op:numeric-less-than

(?arg1 ?arg2; pred:numeric-less-than( ?arg1 ?arg2) )

3. op:numeric-greater-than

(?arg1 ?arg2; pred:numeric-greater-than( ?arg1 ?arg2) )

4.4 Functions and Predicates on Strings

Issue: In the following, we often encounter several versions of some built-ins, since XPath and XQuery allow overloading, i.e., the same function or operator name occurring with different arities. We suggest treating this likewise in RIF, by numbering the different versions of the respective built-ins and treating the unnumbered version as syntactic sugar. For instance, instead of External( func:concat2( str1 str2 ) ) and External( func:concat3( str1 str2 str3 ) ) we equally allow writing simply External( func:concat( str1 str2 ) ) and External( func:concat( str1 str2 str3 ) ). Note that this is purely syntactic sugar, and does not mean that for external predicates and functions we lift the restriction that each function and predicate has a unique assigned arity. Those schemata for which we allow this syntactic sugar appear in the same box.
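One way to read this proposal is as a lookup from an unnumbered name and an argument count to the numbered built-in. The Python sketch below is hypothetical; the table entries follow the numbering used in this document (for example, the two-argument version of compare is func:compare1), and a real table would list every overloaded built-in.

```python
# (unnumbered name, number of arguments) -> numbered built-in (illustrative entries).
OVERLOADS = {
    ("func:concat", 2): "func:concat2",
    ("func:concat", 3): "func:concat3",
    ("func:compare", 2): "func:compare1",
    ("func:compare", 3): "func:compare2",
}

def desugar(name, args):
    """Rewrite an unnumbered external term into its numbered form."""
    key = (name, len(args))
    if key not in OVERLOADS:
        raise KeyError(f"no version of {name} takes {len(args)} arguments")
    return f"External( {OVERLOADS[key]}( {' '.join(args)} ) )"
```

Because the rewriting is purely syntactic, every numbered built-in keeps its unique arity.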

4.4.1 Functions on Strings

1. fn:compare

( ?comparand1 ?comparand2; func:compare1(?comparand1 ?comparand2) )
( ?comparand1 ?comparand2 ?collation; func:compare2(?comparand1 ?comparand2 ?collation) )

where Iexternal( ( ?comparand1 ?comparand2; func:compare1(?comparand1 ?comparand2) ) )(s1 s2) = res

such that res = -1, 0, or 1, depending on whether the value of s1 is respectively less than, equal to, or greater than the value of s2, according to the rules of the collation that is used.
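For the two-argument version, a minimal Python sketch is given below under the assumption that the collation in use is the Unicode codepoint collation. Python compares strings code point by code point, so the built-in comparison operators can be used directly. The function name is illustrative.

```python
def compare_codepoint(s1, s2):
    """Return -1, 0 or 1 as s1 is less than, equal to, or greater than s2."""
    # Each comparison yields a bool; their difference is -1, 0 or 1.
    return (s1 > s2) - (s1 < s2)
```

Other collations would need a locale-aware comparison, which this sketch does not attempt.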


2. fn:concat

( ?arg1; func:concat1(?arg1 ) )
( ?arg1 ?arg2; func:concat2(?arg1 ?arg2 ) )
...
( ?arg1  ?arg2 ... ?argn; func:concatn(?arg1 ?arg2 ... ?argn ) )

Accepts xs:anyAtomicType arguments and casts them to xsd:string. Returns the xsd:string that is the concatenation of the values of its arguments after conversion. Only defined if all arguments are castable to strings (see Section on Cast functions above), otherwise returns an error as defined for fn:concat.

2. fn:string-join

( ?arg1 ?arg2; func:string-join2(?arg1 ?arg2 ) )
( ?arg1 ?arg2 ?arg3; func:string-join3(?arg1 ?arg2 ?arg3 ) )
...
( ?arg1  ?arg2 ... ?argn; func:string-joinn(?arg1 ?arg2 ... ?argn ) )

Returns an xsd:string created by concatenating arguments 1 to (n-1), using the nth argument as a separator. Only defined if all arguments are strings; otherwise returns an error as defined for fn:string-join.
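The convention that the last argument acts as the separator can be sketched in Python as follows. The function name is illustrative, and raising TypeError for non-string arguments is one possible way to signal the error case.

```python
def string_join(*args):
    """Join all arguments except the last, using the last one as the separator."""
    if len(args) < 2 or not all(isinstance(arg, str) for arg in args):
        raise TypeError("string_join expects at least two string arguments")
    *parts, separator = args
    return separator.join(parts)
```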

3. fn:substring

( ?sourceString ?startingLoc; func:substring1( ?sourceString ?startingLoc) )
( ?sourceString ?startingLoc ?length; func:substring2( ?sourceString ?startingLoc ?length) )

Returns the portion of the value of ?sourceString beginning at the position indicated by the value of ?startingLoc and continuing for the number of characters indicated by the value of ?length. The characters returned do not extend beyond ?sourceString. If ?startingLoc is zero or negative, only those characters in positions greater than zero are returned.
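The position arithmetic is easy to get wrong because positions are 1-based and the starting location may be zero or negative. The Python sketch below assumes integer arguments only (fn:substring also accepts non-integer numbers and rounds them, which is not modelled here). The function name is illustrative.

```python
def substring(source, start, length=None):
    """Return the characters at 1-based positions p with start <= p < start + length."""
    first = max(start, 1)                      # positions below 1 do not exist
    end = len(source) + 1                      # exclusive upper bound of valid positions
    last = end if length is None else min(start + length, end)
    return source[first - 1:last - 1] if last > first else ""
```

For instance, a starting location of 0 with length 3 covers positions 0, 1 and 2, of which only positions 1 and 2 exist.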

4. fn:string-length

( func:string-length1() )
( ?arg ; func:string-length2( ?arg ) )

Returns an xsd:integer equal to the length in characters of the argument if it is an xsd:string.

5. fn:upper-case

( ?arg ; func:upper-case( ?arg ) )

6. fn:lower-case

( ?arg ; func:lower-case( ?arg ) )

7. fn:encode-for-uri

( ?arg ; func:encode-for-uri( ?arg ) )

This function encodes reserved characters in an xsd:string that is intended to be used in the path segment of a URI. It is invertible but not idempotent.
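A close Python analogue is urllib.parse.quote with an empty safe set, which percent-encodes everything except letters, digits and the characters - _ . ~ using UTF-8. The sketch below uses that call; the wrapper name is illustrative, and treating it as equivalent to fn:encode-for-uri is an assumption of this example. It also shows why the function is not idempotent: the percent sign itself is escaped on a second pass.

```python
from urllib.parse import quote

def encode_for_uri(uri_part):
    """Percent-encode every character except letters, digits and - _ . ~"""
    return quote(uri_part, safe="")
```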

8. fn:iri-to-uri

( ?iri ; func:iri-to-uri ( ?iri ) )

9. fn:escape-html-uri

( ?uri ; func:escape-html-uri( ?uri ) )

This function escapes all characters except printable characters of the US-ASCII coded character set, specifically the octets ranging from 32 to 126 (decimal). The effect of the function is to escape a URI in the manner HTML user agents handle attribute values that expect URIs. Each character in ?uri to be escaped is replaced by an escape sequence, which is formed by encoding the character as a sequence of octets in UTF-8, and then representing each of these octets in the form %HH, where HH is the hexadecimal representation of the octet. This function must always generate hexadecimal values using the upper-case letters A-F.
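The escaping rule just described can be written out directly. The following Python sketch is illustrative; the function name is an assumption.

```python
def escape_html_uri(uri):
    """Escape every character outside the printable US-ASCII range 32..126."""
    pieces = []
    for char in uri:
        if 32 <= ord(char) <= 126:
            pieces.append(char)                # printable ASCII is left untouched
        else:
            # Encode as UTF-8 and write each octet as %HH with upper-case hex digits.
            pieces.extend(f"%{octet:02X}" for octet in char.encode("utf-8"))
    return "".join(pieces)
```

Unlike fn:encode-for-uri, spaces and reserved characters such as / and # pass through unchanged.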

10. fn:substring-before

( ?arg1 ?arg2; func:substring-before1( ?arg1 ?arg2 ) )
( ?arg1 ?arg2 ?collation; func:substring-before2( ?arg1 ?arg2 ?collation ) )

11. fn:substring-after

( ?arg1 ?arg2; func:substring-after1( ?arg1 ?arg2 ) )
( ?arg1 ?arg2 ?collation; func:substring-after2( ?arg1 ?arg2 ?collation ) )
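For the two-argument versions of substring-before and substring-after, a Python sketch is given below, assuming the Unicode codepoint collation so that plain substring search applies. The function names are illustrative. Following fn:substring-before and fn:substring-after, the result is the empty string when the second argument does not occur in the first.

```python
def substring_before(arg1, arg2):
    """Return the part of arg1 that precedes the first occurrence of arg2."""
    index = arg1.find(arg2)
    return arg1[:index] if index >= 0 else ""

def substring_after(arg1, arg2):
    """Return the part of arg1 that follows the first occurrence of arg2."""
    index = arg1.find(arg2)
    return arg1[index + len(arg2):] if index >= 0 else ""
```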

12. fn:replace

( ?input ?pattern ?replacement; func:replace1( ?input ?pattern ?replacement ) )
( ?input ?pattern ?replacement ?flags; func:replace2( ?input ?pattern ?replacement ?flags ) )

Editor's note: I removed fn:tokenize from previous versions, since tokenizer returns a sequence of strings, we might reconsider this if we (re-)introduce lists, but I left this out for the moment.

4.4.2 Predicates on Strings

The predicates described here examine the first string argument to see whether it contains the second string argument as a substring.

1. fn:contains

( ?arg1 ?arg2; pred:contains1( ?arg1 ?arg2 ) ) 
( ?arg1 ?arg2 ?collation ; pred:contains2( ?arg1 ?arg2 ?collation ) )

Returns true or false indicating whether or not the value of ?arg1 contains (at the beginning, at the end, or anywhere within) at least one sequence of collation units that provides a minimal match to the collation units in the value of ?arg2, according to the collation that is used. "Minimal match" is defined in Unicode Collation Algorithm.

Shall we return an error or false on typing errors?

2. fn:starts-with

( ?arg1 ?arg2; pred:starts-with1( ?arg1 ?arg2 ) )
( ?arg1 ?arg2 ?collation; pred:starts-with2( ?arg1 ?arg2 ?collation) )

Shall we return an error or false on typing errors?

3. fn:ends-with

(?arg1 ?arg2; pred:ends-with1( ?arg1 ?arg2 ) )
(?arg1 ?arg2 ?collation; pred:ends-with2( ?arg1 ?arg2 ?collation) )

Shall we return an error or false on typing errors?

4. fn:matches

( ?input ?pattern; pred:matches1( ?input ?pattern) )
( ?input ?pattern ?flags; pred:matches2( ?input ?pattern ?flags ) )

The function returns true if the input matches the regular expression supplied as pattern as influenced by the flags, if present; otherwise, it returns false.

The effect of calling the first version of this function (omitting the flags) is the same as the effect of calling the second version with the flags argument set to a zero-length string.
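The relationship between the two versions can be sketched in Python with the re module. This is only an approximation: Python's regular expression dialect differs from the one used by fn:matches, only the flags i, m and s are mapped, and the names FLAG_MAP and matches are assumptions of this example. A match anywhere in the input counts, so re.search is used rather than a full match.

```python
import re

# Flags with a direct counterpart in Python's re module (partial mapping).
FLAG_MAP = {"i": re.IGNORECASE, "m": re.MULTILINE, "s": re.DOTALL}

def matches(input_string, pattern, flags=""):
    """Return True if pattern matches anywhere in input_string."""
    re_flags = 0
    for flag in flags:
        if flag not in FLAG_MAP:
            raise ValueError(f"unsupported flag {flag!r}")
        re_flags |= FLAG_MAP[flag]
    return re.search(pattern, input_string, re_flags) is not None
```

Calling matches without flags behaves exactly like calling it with a zero-length flags string.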

Shall we return an error or false on typing errors?

4.5 Functions and Predicates on Dates and Times

If not stated otherwise, in the following we define schemas for functions and operators defined on the date and time datatypes in XML Schema Part 2: Datatypes Second Edition.

As defined in Section 3.3.2 Dates and Times, xsd:dateTime, xsd:date, xsd:time, xsd:gYearMonth, xsd:gYear, xsd:gMonthDay, xsd:gMonth, xsd:gDay values, referred to collectively as date/time values, are represented as seven components or properties: year, month, day, hour, minute, second and timezone. The values of the first five components are xsd:integers. The value of the seconds component is an xsd:decimal and the value of the timezone component is an xsd:dayTimeDuration. For all the date/time datatypes, the timezone property is optional and may or may not be present. Depending on the datatype, some of the remaining six properties must be present and some must be absent. Absent, or missing, properties are represented by the empty sequence. This value is referred to as the local value in that the value is in the given timezone. Before comparing or subtracting xsd:dateTime values, this local value must be translated or normalized to UTC.


4.5.1 Predicates on Date and Time Values

1. op:dateTime-equal

( ?arg1 ?arg2; pred:dateTime-equal( ?arg1 ?arg2) )

where Iexternal( ?arg1 ?arg2; pred:dateTime-equal( ?arg1 ?arg2 ) )(s1 s2) = t

if and only if op:dateTime-equal(s1, s2) returns true, as defined in XQuery 1.0 and XPath 2.0 Functions and Operators, f otherwise. The following schemata are defined analogously with respect to their corresponding operators as defined in XQuery 1.0 and XPath 2.0 Functions and Operators.
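Comparison after normalization to UTC can be sketched with Python's datetime module. The sketch handles only values that carry an explicit timezone offset and whose lexical form is accepted by datetime.fromisoformat; values without a timezone, which require an implicit timezone, are rejected. The function names are illustrative.

```python
from datetime import datetime, timezone

def _to_utc(lexical):
    value = datetime.fromisoformat(lexical)
    if value.tzinfo is None:
        raise ValueError("this sketch handles only values with a timezone")
    return value.astimezone(timezone.utc)      # normalize the local value to UTC

def datetime_equal(arg1, arg2):
    return _to_utc(arg1) == _to_utc(arg2)

def datetime_less_than(arg1, arg2):
    return _to_utc(arg1) < _to_utc(arg2)
```

Two values with different local times are equal when they denote the same instant, for example 12:00 at offset -01:00 and 17:00 at offset +04:00 on the same day.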

2. op:dateTime-less-than

( ?arg1 ?arg2; pred:dateTime-less-than(?arg1 ?arg2 ) )


3. op:dateTime-greater-than

( ?arg1 ?arg2; pred:dateTime-greater-than(?arg1 ?arg2 ) )

4. op:date-equal

( ?arg1 ?arg2; pred:date-equal( ?arg1 ?arg2) )

5. op:date-less-than

( ?arg1 ?arg2; pred:date-less-than(?arg1 ?arg2 ) )

6. op:date-greater-than

( ?arg1 ?arg2; pred:date-greater-than(?arg1 ?arg2 ) )

7. op:time-equal

( ?arg1 ?arg2; pred:time-equal( ?arg1 ?arg2) )

8. op:time-less-than

( ?arg1 ?arg2; pred:time-less-than(?arg1 ?arg2 ) )

9. op:time-greater-than

( ?arg1 ?arg2; pred:time-greater-than(?arg1 ?arg2 ) )
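The three xsd:dateTime predicates above can be approximated in Python as follows. This is a rough sketch, not the normative definition: the function names are hypothetical, the lexical forms are parsed with datetime.fromisoformat (which accepts only part of the xsd:dateTime lexical space), and using UTC as the implicit timezone for values without one is an assumption. Python's True and False play the role of the truth values t and f.

```python
from datetime import datetime, timezone

# Assumption: values without a timezone are interpreted in this implicit timezone.
IMPLICIT_TIMEZONE = timezone.utc


def to_utc(lexical):
    """Parse a dateTime string and normalize its local value to UTC."""
    value = datetime.fromisoformat(lexical)
    if value.tzinfo is None:
        value = value.replace(tzinfo=IMPLICIT_TIMEZONE)
    return value.astimezone(timezone.utc)


def date_time_equal(arg1, arg2):
    return to_utc(arg1) == to_utc(arg2)


def date_time_less_than(arg1, arg2):
    return to_utc(arg1) < to_utc(arg2)


def date_time_greater_than(arg1, arg2):
    return to_utc(arg1) > to_utc(arg2)
```

For example, "2002-04-02T12:00:00-01:00" and "2002-04-02T17:00:00+04:00" both normalize to 13:00:00 UTC on the same day, so date_time_equal holds for them.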

4.5.2 Functions on Dates and Times

1. fn:year-from-dateTime

( ?arg ; func:year-from-dateTime( ?arg ) )

where Iexternal( ?arg ; func:year-from-dateTime( ?arg ) )(s) = res

such that res is the result of fn:year-from-dateTime(s) as defined in XQuery 1.0 and XPath 2.0 Functions and Operators if s is in the value space of xsd:dateTime; otherwise, res is an error as defined in XQuery 1.0 and XPath 2.0 Functions and Operators. The following schemata are defined analogously with respect to their corresponding functions and possible error codes as defined in XQuery 1.0 and XPath 2.0 Functions and Operators.


Editor's remark: Where in XQuery 1.0 and XPath 2.0 Functions and Operators are the errors for these functions defined?

This definition deviates slightly from the original definition of fn:year-from-dateTime, which says: "If ?arg is the empty sequence, returns the empty sequence." RIF has no notion of a "sequence".

Editor's remark: I am not sure whether this is a problem. Do we have to deal with empty sequences?

2. fn:month-from-dateTime

( ?arg ; func:month-from-dateTime( ?arg ) )

3. fn:day-from-dateTime

( ?arg ; func:day-from-dateTime( ?arg ) )

4. fn:hours-from-dateTime

( ?arg ; func:hours-from-dateTime( ?arg ) )

5. fn:minutes-from-dateTime

( ?arg ; func:minutes-from-dateTime( ?arg  ) )

6. fn:seconds-from-dateTime

( ?arg ; func:seconds-from-dateTime( ?arg ) )

7. fn:year-from-date

( ?arg ; func:year-from-date( ?arg ) )

8. fn:month-from-date

( ?arg ; func:month-from-date( ?arg ) )

9. fn:day-from-date

( ?arg ; func:day-from-date( ?arg ) )

10. fn:hours-from-time

( ?arg ; func:hours-from-time( ?arg ) )

11. fn:minutes-from-time

( ?arg ; func:minutes-from-time( ?arg ) )

12. fn:seconds-from-time

( ?arg ; func:seconds-from-time( ?arg ) )
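A few of the xsd:dateTime extraction functions can be sketched in Python. The function names are hypothetical, the parsing relies on datetime.fromisoformat (which covers only part of the xsd:dateTime lexical space), and a Python ValueError stands in for the XQuery error, whose exact code this sketch does not model. Note that the components are taken from the local value, without normalizing to UTC.

```python
from datetime import datetime
from decimal import Decimal


def _parse(lexical):
    # Raises ValueError for strings that are not accepted as a dateTime.
    return datetime.fromisoformat(lexical)


def year_from_date_time(arg):
    return _parse(arg).year


def hours_from_date_time(arg):
    # Hours of the local value; no timezone normalization is applied.
    return _parse(arg).hour


def seconds_from_date_time(arg):
    # Seconds are a decimal: whole seconds plus the fractional part.
    value = _parse(arg)
    return Decimal(value.second) + Decimal(value.microsecond) / Decimal(1000000)
```

With these definitions, hours_from_date_time("1999-12-31T21:20:00-05:00") yields 21, the hour of the local value.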

4.6 Functions and Predicates on rdf:XMLLiterals

We support (though do not impose) an XPath built-in function, applied to XMLLiterals as follows.

(?xml ?xquery; func:evalXQuery( ?xml ?xquery) )

where Iexternal(?xml ?xquery; func:evalXQuery( ?xml ?xquery) )(s1 s2) = res such that, if s1 is in the value space of rdf:XMLLiteral, s2 is in the value space of xsd:string, and s2 is a syntactically valid XQuery expression, then res is the result of evaluating the XQuery expression s2 with the root node of the XML document represented by s1 as the context item. Otherwise, if no other error defined in XQuery applies, res is the error err:XQTY0028.

Editor's remark: The error code err:XQTY0028 is currently unused.

4.7 Functions and Predicates on rif:text

(?arg ; func:lang( ?arg ) )

where Iexternal(?arg ; func:lang( ?arg ) )(s) = res such that res is the language tag string of s if s is in the value space of rif:text, and ""^^xsd:string otherwise.
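The behaviour of func:lang can be illustrated with a minimal Python sketch. The class RifText is a hypothetical stand-in for a value in the value space of rif:text, modelled here as a pair of a string and a language tag.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RifText:
    """Hypothetical model of a rif:text value: a string with a language tag."""
    text: str
    lang: str


def lang(value):
    """Return the language tag of a rif:text value, and the empty string otherwise."""
    return value.lang if isinstance(value, RifText) else ""
```

For example, lang(RifText("Hallo", "de")) returns "de", while lang("Hallo") returns the empty string.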

5 References

[RDF-CONCEPTS]
Resource Description Framework (RDF): Concepts and Abstract Syntax, Klyne G., Carroll J. (Editors), W3C Recommendation, 10 February 2004, http://www.w3.org/TR/2004/REC-rdf-concepts-20040210/. Latest version available at http://www.w3.org/TR/rdf-concepts/.

[RDF-SEMANTICS]
RDF Semantics, Patrick Hayes, Editor, W3C Recommendation, 10 February 2004, http://www.w3.org/TR/2004/REC-rdf-mt-20040210/. Latest version available at http://www.w3.org/TR/rdf-mt/.

[RDF-SCHEMA]
RDF Vocabulary Description Language 1.0: RDF Schema, Brian McBride, Editor, W3C Recommendation, 10 February 2004, http://www.w3.org/TR/rdf-schema/.

[RFC-3066]
RFC 3066 - Tags for the Identification of Languages, H. Alvestrand, IETF, January 2001. This document is at http://www.isi.edu/in-notes/rfc3066.txt.

[RFC-3987]
RFC 3987 - Internationalized Resource Identifiers (IRIs), M. Duerst and M. Suignard, IETF, January 2005. This document is at http://www.ietf.org/rfc/rfc3987.txt.

[XML-SCHEMA2]
XML Schema Part 2: Datatypes, W3C Recommendation, World Wide Web Consortium, 2 May 2001. This version is http://www.w3.org/TR/2001/REC-xmlschema-2-20010502/. The latest version is available at http://www.w3.org/TR/xmlschema-2/.


6 Appendix: Schemas for Externally Defined Terms

This section is an edited copy of a section from RIF Framework for Logic Dialects. It is reproduced here for the convenience of the reader of the RIF-BLD document who might not be familiar with RIF-FLD.

This section defines external schemas, which serve as templates for externally defined terms. These schemas determine which externally defined terms are acceptable in a RIF dialect. Externally defined terms include RIF built-ins, but are more general. They are designed to also accommodate the ideas of procedural attachments and querying of external data sources. Because of the need to accommodate many different possibilities, the RIF logical framework supports a very general notion of an externally defined term. Such a term is not necessarily a function or a predicate -- it can be a frame, a classification term, and so on.

Definition (Schema for external term). An external schema is a statement of the form (?X1 ... ?Xn; τ) where

The names of the variables in an external schema are immaterial, but their order is important. For instance, (?X ?Y; ?X[foo->?Y]) and (?V ?W; ?V[foo->?W]) are considered to be the same schema, but (?X ?Y; ?X[foo->?Y]) and (?Y ?X; ?X[foo->?Y]) are viewed as different schemas.
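One way to test whether two schemas are the same is to rename each variable to its position in the variable list and compare the results. The Python sketch below assumes a hypothetical encoding that is not part of RIF: variables are strings starting with "?", constants are other strings, and compound terms are tuples whose first element names the kind of term, so the frame ?X[foo->?Y] becomes ("frame", "?X", "foo", "?Y").

```python
def rename(term, mapping):
    """Apply a variable renaming to a term encoded as nested tuples."""
    if isinstance(term, tuple):
        return tuple(rename(part, mapping) for part in term)
    return mapping.get(term, term)


def canonical(variables, term):
    """Replace the i-th listed variable by a name that depends only on i."""
    mapping = {name: "?_%d" % index for index, name in enumerate(variables)}
    return (len(variables), rename(term, mapping))


def same_schema(schema1, schema2):
    """Schemas are pairs (variables, term); names are immaterial, order is not."""
    return canonical(*schema1) == canonical(*schema2)


s_xy = (["?X", "?Y"], ("frame", "?X", "foo", "?Y"))
s_vw = (["?V", "?W"], ("frame", "?V", "foo", "?W"))
s_yx = (["?Y", "?X"], ("frame", "?X", "foo", "?Y"))
```

Under this encoding, same_schema(s_xy, s_vw) holds, whereas same_schema(s_xy, s_yx) does not.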

A term t is an instance of an external schema (?X1 ... ?Xn; τ) iff t can be obtained from τ by a simultaneous substitution ?X1/s1 ... ?Xn/sn of the variables ?X1 ... ?Xn with terms s1 ... sn, respectively. Some of the terms si can be variables themselves. For example, ?Z[foo->f(a ?P)] is an instance of (?X ?Y; ?X[foo->?Y]) by the substitution ?X/?Z ?Y/f(a ?P).
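The notion of an instance can be sketched in Python as a simultaneous substitution together with a one-way matching test. The encoding is an assumption made for this sketch only: variables are strings starting with "?", constants are other strings, and compound terms are tuples, so ?X[foo->?Y] becomes ("frame", "?X", "foo", "?Y") and f(a ?P) becomes ("f", "a", "?P").

```python
def instantiate(schema, terms):
    """Apply the simultaneous substitution ?X1/s1 ... ?Xn/sn to the schema's term."""
    variables, tau = schema
    if len(variables) != len(terms):
        raise ValueError("one term is needed for each schema variable")
    substitution = dict(zip(variables, terms))

    def apply(term):
        if isinstance(term, tuple):
            return tuple(apply(part) for part in term)
        return substitution.get(term, term)

    return apply(tau)


def is_instance(schema, term):
    """Return True if term can be obtained from the schema by some substitution."""
    variables, tau = schema
    binding = {}

    def match(pattern, candidate):
        if isinstance(pattern, str) and pattern in variables:
            if pattern in binding:
                return binding[pattern] == candidate
            binding[pattern] = candidate
            return True
        if isinstance(pattern, tuple) and isinstance(candidate, tuple):
            return len(pattern) == len(candidate) and all(
                match(p, c) for p, c in zip(pattern, candidate))
        return pattern == candidate

    return match(tau, term)


schema = (["?X", "?Y"], ("frame", "?X", "foo", "?Y"))
example = instantiate(schema, ["?Z", ("f", "a", "?P")])
```

Here example is the encoding of ?Z[foo->f(a ?P)], and is_instance(schema, example) holds. A bare variable such as "?Z" is not an instance, because the schema's term is not a variable.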

Observe that a variable cannot be an instance of an external schema, since τ in the above definition cannot be a variable. It will be seen later that this implies that a term of the form External(?X) is not well-formed in RIF.

The intuition behind the notion of an external schema, such as (?X ?Y; ?X["foo"^^xsd:string->?Y]) or (?V; "pred:isTime"^^rif:iri(?V)), is that ?X["foo"^^xsd:string->?Y] or "pred:isTime"^^rif:iri(?V) are invocation patterns for querying external sources, and instances of those schemas correspond to concrete invocations. Thus, External("http://foo.bar.com"^^rif:iri["foo"^^xsd:string->"123"^^xsd:integer]) and External("pred:isTime"^^rif:iri("22:33:44"^^xsd:time)) are examples of invocations of external terms -- one querying an external source and another invoking a built-in.


Definition (Coherent set of external schemas). A set of external schemas is coherent if there can be no term, t, that is an instance of two distinct schemas.

The intuition behind this notion is to ensure that any use of an external term is associated with at most one external schema. This assumption is relied upon in the definition of the semantics of externally defined terms. Note that the coherence condition is easy to verify syntactically and that it implies that schemas like (?X ?Y; ?X[foo->?Y]) and (?Y ?X; ?X[foo->?Y]), which differ only in the order of their variables, cannot be in the same coherent set.
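One possible syntactic check for coherence, offered here as an interpretation and not as a procedure prescribed by RIF, is to test whether the terms of two distinct schemas unify once their variables are renamed apart: a unifier yields a common instance, and a common instance yields a unifier. The Python sketch below assumes a hypothetical encoding (variables are strings starting with "?", compound terms are tuples) and assumes the input set contains no schema twice under different variable names.

```python
from itertools import combinations


def is_var(term):
    return isinstance(term, str) and term.startswith("?")


def rename_apart(term, suffix):
    """Give the variables of a term a suffix so two schemas share no variable."""
    if isinstance(term, tuple):
        return tuple(rename_apart(part, suffix) for part in term)
    return term + suffix if is_var(term) else term


def walk(term, subst):
    while is_var(term) and term in subst:
        term = subst[term]
    return term


def occurs(var, term, subst):
    term = walk(term, subst)
    if term == var:
        return True
    return isinstance(term, tuple) and any(occurs(var, p, subst) for p in term)


def unify(a, b, subst):
    """Standard syntactic unification with occurs check; extends subst in place."""
    a, b = walk(a, subst), walk(b, subst)
    if a == b:
        return True
    if is_var(a):
        if occurs(a, b, subst):
            return False
        subst[a] = b
        return True
    if is_var(b):
        return unify(b, a, subst)
    if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
        return all(unify(x, y, subst) for x, y in zip(a, b))
    return False


def coherent(schemas):
    """Schemas are pairs (variables, term); no two may share an instance."""
    for (_, t1), (_, t2) in combinations(schemas, 2):
        if unify(rename_apart(t1, "#1"), rename_apart(t2, "#2"), {}):
            return False
    return True


s_xy = (["?X", "?Y"], ("frame", "?X", "foo", "?Y"))
s_yx = (["?Y", "?X"], ("frame", "?X", "foo", "?Y"))
s_time = (["?V"], ("pred:isTime", "?V"))
```

With these definitions, coherent([s_xy, s_time]) holds, while coherent([s_xy, s_yx]) does not, since the two schemas differ only in the order of their variables.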

It is important to understand that external schemas are not part of the language in RIF, since they do not appear anywhere in RIF statements. Instead, they are best thought of as part of the grammar of the language.