W3C


RIF Data Types and Built-Ins

W3C Editor's Draft 22 February 2008

This version:
http://www.w3.org/2005/rules/wg/draft/ED-rif-dtb-20080222/
Latest editor's draft:
http://www.w3.org/2005/rules/wg/draft/rif-dtb/
Previous version:
http://www.w3.org/2005/rules/wg/draft/ED-rif-dtb-20080219/ (color-coded diff)
Editors:
Axel, DERI
Harold, NRC


Abstract

Status of this Document

May Be Superseded

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.

This document is being published as one of a set of 5 documents:

  1. RIF Basic Logic Dialect
  2. RIF Framework for Logic Dialects
  3. RIF Data Types and Built-Ins (this document)
  4. RIF Use Cases and Requirements
  5. RIF RDF and OWL Compatibility

Please Comment By 19 February 2008

The Rule Interchange Format (RIF) Working Group seeks public feedback on these Working Drafts. Please send your comments to public-rif-comments@w3.org (public archive). If possible, please offer specific changes to the text that would address your concern.

No Endorsement

Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

Patents

This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.


Contents

1 List of BLD built-ins

This document, developed by the Rule Interchange Format (RIF) Working Group, specifies the list of built-ins supported by the RIF Basic Logic Dialect (RIF-BLD, http://www.w3.org/TR/rif-bld/). The definitions of the listed functions and operators are taken from XQuery 1.0 and XPath 2.0 Functions and Operators (W3C Recommendation 23 January 2007).

The following list of built-ins for RIF-BLD is currently under discussion within the Rule Interchange Format (RIF) Working Group. The Discussion section lists several open issues regarding built-ins for RIF-BLD and, more generally, for other RIF dialects. They are to be used as a starting point for discussion within the group and will be removed before publishing this document.

1.1 List of Supported Built-ins

1.2 Functions and Operators on Numerics

The following functions and operators are defined on the numeric datatypes defined in XML Schema Part 2: Datatypes Second Edition.

      • Issues: prefixes must be documented. ***

1.2.1 Operators on Numeric Values

1. op:numeric-add


op:numeric-add($arg1 as numeric, $arg2 as numeric) as numeric


2. op:numeric-subtract


op:numeric-subtract($arg1 as numeric, $arg2 as numeric) as numeric


3. op:numeric-multiply


op:numeric-multiply($arg1 as numeric, $arg2 as numeric) as numeric


4. op:numeric-divide


op:numeric-divide($arg1 as numeric, $arg2 as numeric) as numeric


1.2.2 Comparison Operators on Numeric Values

1. op:numeric-equal


op:numeric-equal($arg1 as numeric, $arg2 as numeric) as xs:boolean


2. op:numeric-less-than


op:numeric-less-than($arg1 as numeric, $arg2 as numeric) as xs:boolean


3. op:numeric-greater-than


op:numeric-greater-than($arg1 as numeric, $arg2 as numeric) as xs:boolean



1.3 Functions on Strings

The following functions and operators are defined on the xs:string datatype of XML Schema Part 2: Datatypes Second Edition and on the datatypes derived from it.

      • Issues: prefixes must be documented. ***
      • Issue: Collation support. The functions compare, contains, starts-with, ends-with, substring-before, and substring-after all have both a form that allows specification of a collation order and a form that does not. If these are really required, we should define which collation orders implementations are required to support. Two options: (a) drop the forms which include collation arguments. These could be added in future dialects or local extensions but would not be required in conformant BLD implementations; (b) include the collation forms and specify that conformant implementations are only required to support the collation order http://www.w3.org/2005/xpath-functions/collation/codepoint as described in http://www.w3.org/TR/xpath-functions/#collations. ***


1.3.1 Equality and Comparison of Strings

1. fn:compare


fn:compare($comparand1 as xs:string?,
           $comparand2 as xs:string?) as xs:integer?

fn:compare($comparand1   as xs:string?,
           $comparand2   as xs:string?,
           $collation    as xs:string) as xs:integer?


Returns -1, 0, or 1, depending on whether the value of $comparand1 is respectively less than, equal to, or greater than the value of $comparand2, according to the rules of the collation that is used.
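The behaviour described above can be sketched in Python, assuming the Unicode codepoint collation is the one in use (the document leaves collation support as an open issue). The function name compare and the use of None to model the empty sequence are illustrative choices, not part of the specification.

```python
from typing import Optional


def compare(comparand1: Optional[str], comparand2: Optional[str]) -> Optional[int]:
    """Sketch of two-argument fn:compare under an assumed codepoint collation."""
    # The empty sequence is modelled as None; if either argument is empty,
    # the result is empty as well.
    if comparand1 is None or comparand2 is None:
        return None
    # Python orders str values code point by code point.
    if comparand1 < comparand2:
        return -1
    if comparand1 > comparand2:
        return 1
    return 0


print(compare("abc", "abd"))  # -1
print(compare("b", "a"))      # 1
```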

1.3.2 Functions on String Values

1. fn:concat


fn:concat($arg1  as xs:anyAtomicType?,
          $arg2  as xs:anyAtomicType?,
          ...  ) as xs:string


Accepts two or more xs:anyAtomicType arguments and casts them to xs:string. Returns the xs:string that is the concatenation of the values of its arguments after conversion.

2. fn:string-join


fn:string-join($arg1 as xs:string*, $arg2 as xs:string) as xs:string


Returns an xs:string created by concatenating the members of the $arg1 sequence using $arg2 as a separator.
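As an illustration of fn:concat and fn:string-join, here is a minimal Python sketch. It is restricted to string arguments (casting other atomic types to xs:string follows XML Schema rules that Python's str() does not reproduce), and it models the empty sequence as None. The function names are hypothetical.

```python
from typing import Optional, Sequence


def concat(*args: Optional[str]) -> str:
    """Sketch of fn:concat for string arguments only."""
    if len(args) < 2:
        raise TypeError("fn:concat accepts two or more arguments")
    # An empty-sequence argument (None here) contributes a zero-length string.
    return "".join("" if a is None else a for a in args)


def string_join(arg1: Sequence[str], arg2: str) -> str:
    """Sketch of fn:string-join: join the members of arg1 with separator arg2."""
    return arg2.join(arg1)


print(concat("un", "grateful"))                      # ungrateful
print(string_join(["Now", "is", "the", "time"], " "))  # Now is the time
```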

3. fn:substring


fn:substring( $sourceString      as xs:string?,
              $startingLoc       as xs:double) as xs:string

fn:substring( $sourceString      as xs:string?,
              $startingLoc       as xs:double,
              $length            as xs:double) as xs:string


Returns the portion of the value of $sourceString beginning at the position indicated by the value of $startingLoc and continuing for the number of characters indicated by the value of $length. The characters returned do not extend beyond $sourceString. If $startingLoc is zero or negative, only those characters in positions greater than zero are returned.
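The position arithmetic of fn:substring is easy to get wrong because positions are 1-based and the xs:double arguments are rounded. The following Python sketch follows the XPath definition (characters at positions p with round($startingLoc) <= p < round($startingLoc) + round($length)). It assumes finite arguments; NaN and infinite values are not handled. The helper xpath_round is a hypothetical name for the round-half-up rule of fn:round.

```python
import math
from typing import Optional


def xpath_round(x: float) -> int:
    """Round half towards positive infinity, as fn:round does."""
    return math.floor(x + 0.5)


def substring(source: Optional[str], starting_loc: float,
              length: Optional[float] = None) -> str:
    """Sketch of fn:substring for finite numeric arguments."""
    if source is None:  # the empty sequence yields the zero-length string
        return ""
    start = xpath_round(starting_loc)
    if length is None:
        # Two-argument form: everything from the start position onwards.
        return "".join(ch for p, ch in enumerate(source, 1) if p >= start)
    end = start + xpath_round(length)
    # Positions are 1-based; positions <= 0 simply select nothing.
    return "".join(ch for p, ch in enumerate(source, 1) if start <= p < end)


print(substring("metadata", 4, 3))    # ada
print(substring("12345", 1.5, 2.6))   # 234
print(substring("12345", 0, 3))       # 12
```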

4. fn:string-length


fn:string-length() as xs:integer

fn:string-length($arg as xs:string?) as xs:integer


Returns an xs:integer equal to the length in characters of the value of $arg.

5. fn:upper-case


fn:upper-case($arg as xs:string?) as xs:string


6. fn:lower-case


fn:lower-case($arg as xs:string?) as xs:string


7. fn:encode-for-uri


fn:encode-for-uri($uri-part as xs:string?) as xs:string


This function encodes reserved characters in an xs:string that is intended to be used in the path segment of a URI. It is invertible but not idempotent.
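The "invertible but not idempotent" property can be illustrated with a small Python sketch built on urllib.parse.quote, which percent-encodes the UTF-8 octets of every character outside the unreserved set (letters, digits, "-", "_", ".", "~"). This is an approximation for illustration, not a conformance-tested implementation; the name encode_for_uri is hypothetical.

```python
from typing import Optional
from urllib.parse import quote, unquote


def encode_for_uri(uri_part: Optional[str]) -> str:
    """Sketch of fn:encode-for-uri using percent-encoding of UTF-8 octets."""
    if uri_part is None:  # the empty sequence yields the zero-length string
        return ""
    # safe="~" keeps only the unreserved characters unescaped; "/" is escaped.
    return quote(uri_part, safe="~")


once = encode_for_uri("100% organic")
print(once)  # 100%25%20organic

# Invertible: decoding recovers the original string.
print(unquote(once) == "100% organic")  # True

# Not idempotent: encoding again escapes the "%" signs a second time.
print(encode_for_uri(once) == once)  # False
```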

8. fn:iri-to-uri


fn:iri-to-uri($iri as xs:string?) as xs:string


9. fn:escape-html-uri


fn:escape-html-uri($uri as xs:string?) as xs:string


This function escapes all characters except printable characters of the US-ASCII coded character set, specifically the octets ranging from 32 to 126 (decimal). The effect of the function is to escape a URI in the manner in which HTML user agents handle attribute values that expect URIs. Each character in $uri to be escaped is replaced by an escape sequence, which is formed by encoding the character as a sequence of octets in UTF-8, and then representing each of these octets in the form %HH, where HH is the hexadecimal representation of the octet. This function must always generate hexadecimal values using the upper-case letters A-F.
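The escaping rule described above can be sketched directly in Python. The function name escape_html_uri is illustrative, and None stands in for the empty sequence.

```python
from typing import Optional


def escape_html_uri(uri: Optional[str]) -> str:
    """Sketch of fn:escape-html-uri: escape everything outside 32..126."""
    if uri is None:  # the empty sequence yields the zero-length string
        return ""
    out = []
    for ch in uri:
        if 32 <= ord(ch) <= 126:
            # Printable US-ASCII characters (including space) are kept as-is.
            out.append(ch)
        else:
            # Encode as UTF-8 and write each octet as %HH in upper-case hex.
            out.extend(f"%{octet:02X}" for octet in ch.encode("utf-8"))
    return "".join(out)


print(escape_html_uri("caf\u00e9"))  # caf%C3%A9
```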

1.3.3 Functions Based on Substring Matching

The functions described here examine a string $arg1 to see whether it contains another string $arg2 as a substring. The result depends on whether $arg2 is a substring of $arg1, and if so, on the range of characters in $arg1 which $arg2 matches.

1. fn:contains


fn:contains($arg1 as xs:string?, $arg2 as xs:string?) as xs:boolean

fn:contains( $arg1      as xs:string?,
             $arg2      as xs:string?,
             $collation as xs:string) as xs:boolean


Returns an xs:boolean indicating whether or not the value of $arg1 contains (at the beginning, at the end, or anywhere within) at least one sequence of collation units that provides a minimal match to the collation units in the value of $arg2, according to the collation that is used. "Minimal match" is defined in Unicode Collation Algorithm.

2. fn:starts-with


fn:starts-with($arg1 as xs:string?, $arg2 as xs:string?) as xs:boolean

fn:starts-with( $arg1      as xs:string?,
                $arg2      as xs:string?,
                $collation as xs:string) as xs:boolean


Returns an xs:boolean indicating whether or not the value of $arg1 starts with a sequence of collation units that provides a minimal match to the collation units of $arg2 according to the collation that is used. "Minimal match" is defined in Unicode Collation Algorithm.

3. fn:ends-with


fn:ends-with($arg1 as xs:string?, $arg2 as xs:string?) as xs:boolean

fn:ends-with( $arg1      as xs:string?,
              $arg2      as xs:string?,
              $collation as xs:string) as xs:boolean


4. fn:substring-before


fn:substring-before($arg1 as xs:string?, $arg2 as xs:string?) as xs:string

fn:substring-before( $arg1       as xs:string?,
                     $arg2       as xs:string?,
                     $collation  as xs:string) as xs:string


Returns the substring of the value of $arg1 that precedes in the value of $arg1 the first occurrence of a sequence of collation units that provides a minimal match to the collation units of $arg2 according to the collation that is used.

5. fn:substring-after


fn:substring-after($arg1 as xs:string?, $arg2 as xs:string?) as xs:string

fn:substring-after( $arg1        as xs:string?,
                    $arg2        as xs:string?,
                    $collation   as xs:string) as xs:string
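For option (b) of the collation issue, where only the codepoint collation must be supported, the substring-matching functions of this section reduce to plain string operations. The Python sketch below assumes that collation and treats the empty sequence (None) as the zero-length string; all function names are illustrative.

```python
from typing import Optional


def _s(value: Optional[str]) -> str:
    """Treat the empty sequence (None) as the zero-length string."""
    return "" if value is None else value


def contains(arg1: Optional[str], arg2: Optional[str]) -> bool:
    return _s(arg2) in _s(arg1)


def starts_with(arg1: Optional[str], arg2: Optional[str]) -> bool:
    return _s(arg1).startswith(_s(arg2))


def ends_with(arg1: Optional[str], arg2: Optional[str]) -> bool:
    return _s(arg1).endswith(_s(arg2))


def substring_before(arg1: Optional[str], arg2: Optional[str]) -> str:
    a, b = _s(arg1), _s(arg2)
    index = a.find(b)
    # No match, or a zero-length $arg2, both give the zero-length string.
    return a[:index] if index >= 0 else ""


def substring_after(arg1: Optional[str], arg2: Optional[str]) -> str:
    a, b = _s(arg1), _s(arg2)
    index = a.find(b)
    # A zero-length $arg2 matches at position 0, so all of $arg1 is returned.
    return a[index + len(b):] if index >= 0 else ""


print(substring_before("tattoo", "attoo"))  # t
print(substring_after("tattoo", "tat"))     # too
```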


1.3.4 String Functions that Use Pattern Matching

The three functions described here make use of a regular expression syntax for pattern matching. The regular expression syntax is described in section 7.6.1 Regular Expression Syntax of XQuery 1.0 and XPath 2.0 Functions and Operators.

1. fn:matches


fn:matches($input as xs:string?, $pattern as xs:string) as xs:boolean

fn:matches( $input       as xs:string?,
            $pattern     as xs:string,
            $flags       as xs:string) as xs:boolean


The function returns true if $input matches the regular expression supplied as $pattern as influenced by the value of $flags, if present; otherwise, it returns false.

The effect of calling the first version of this function (omitting the argument $flags) is the same as the effect of calling the second version with the $flags argument set to a zero-length string. Flags are defined in section 7.6.1.1 Flags of XQuery 1.0 and XPath 2.0 Functions and Operators.

2. fn:replace


fn:replace( $input       as xs:string?,
            $pattern     as xs:string,
            $replacement as xs:string) as xs:string

fn:replace( $input       as xs:string?,
            $pattern     as xs:string,
            $replacement as xs:string,
            $flags       as xs:string) as xs:string


The function returns the xs:string that is obtained by replacing each non-overlapping substring of $input that matches the given $pattern with an occurrence of the $replacement string.

3. fn:tokenize


fn:tokenize($input as xs:string?, $pattern as xs:string) as xs:string*

fn:tokenize( $input      as xs:string?,
             $pattern    as xs:string,
             $flags      as xs:string) as xs:string*


This function breaks the $input string into a sequence of strings, treating any substring that matches $pattern as a separator. The separators themselves are not returned.
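A rough Python sketch of fn:matches and fn:tokenize follows. It relies on Python's re module, whose regular expression dialect is similar to, but not the same as, the XPath one, so it should be read as an approximation. The mapping of the flag letters s, m, i and x onto Python flags is likewise an assumption, and the function names are hypothetical.

```python
import re
from typing import List, Optional

# Assumed correspondence between XPath flag letters and Python re flags.
_FLAGS = {"s": re.DOTALL, "m": re.MULTILINE, "i": re.IGNORECASE, "x": re.VERBOSE}


def _to_re_flags(flags: str) -> int:
    value = 0
    for letter in flags:
        value |= _FLAGS[letter]
    return value


def matches(input_: Optional[str], pattern: str, flags: str = "") -> bool:
    """Sketch of fn:matches: true if the pattern matches anywhere in the input."""
    return re.search(pattern, input_ or "", _to_re_flags(flags)) is not None


def tokenize(input_: Optional[str], pattern: str, flags: str = "") -> List[str]:
    """Sketch of fn:tokenize: split on separators, dropping the separators."""
    re_flags = _to_re_flags(flags)
    if re.search(pattern, "", re_flags) is not None:
        raise ValueError("pattern must not match a zero-length string")
    if not input_:
        return []
    tokens, position = [], 0
    for match in re.finditer(pattern, input_, re_flags):
        tokens.append(input_[position:match.start()])
        position = match.end()
    tokens.append(input_[position:])
    return tokens


print(matches("abracadabra", "^a.*a$"))    # True
print(tokenize("1, 15, 24, 50", r",\s*"))  # ['1', '15', '24', '50']
```

The separator loop uses re.finditer rather than re.split because re.split would also return the text of capturing groups, which fn:tokenize does not.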



1.4 Functions and Operators on Dates and Times

Date/time datatype values

As defined in Section 3.3.2 Dates and Times, xs:dateTime, xs:date, xs:time, xs:gYearMonth, xs:gYear, xs:gMonthDay, xs:gMonth, and xs:gDay values, referred to collectively as date/time values, are represented as seven components or properties: year, month, day, hour, minute, second and timezone. The values of the first five components are xs:integers. The value of the second (seconds) component is an xs:decimal, and the value of the timezone component is an xs:dayTimeDuration. For all the date/time datatypes, the timezone property is optional and may or may not be present. Depending on the datatype, some of the remaining six properties must be present and some must be absent. Absent, or missing, properties are represented by the empty sequence. This value is referred to as the local value in that the value is in the given timezone. Before comparing or subtracting xs:dateTime values, this local value must be translated or normalized to UTC.

      • Issues: prefixes must be documented. ***

1.4.1 Comparison Operators on Date and Time Values

1. op:dateTime-equal


op:dateTime-equal($arg1 as xs:dateTime, $arg2 as xs:dateTime) as xs:boolean


Returns true if and only if the value of $arg1 is equal to the value of $arg2 according to the algorithm defined in section 3.2.7.4 of XML Schema Part 2: Datatypes Second Edition "Order relation on dateTime" for xs:dateTime values with timezones. Returns false otherwise.
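The normalization to UTC that precedes comparison can be illustrated with Python's datetime module. This sketch only covers values that carry a timezone (values without one depend on an implicit timezone, which is not modelled here) and assumes lexical forms that datetime.fromisoformat accepts, such as an explicit +hh:mm or -hh:mm offset. The function name is illustrative.

```python
from datetime import datetime, timezone


def date_time_equal(arg1: str, arg2: str) -> bool:
    """Sketch of op:dateTime-equal for values that carry a timezone."""
    first = datetime.fromisoformat(arg1)
    second = datetime.fromisoformat(arg2)
    if first.tzinfo is None or second.tzinfo is None:
        raise ValueError("this sketch only handles values with timezones")
    # Normalize both local values to UTC, then compare the instants.
    return first.astimezone(timezone.utc) == second.astimezone(timezone.utc)


# 12:00 at -01:00 and 17:00 at +04:00 are both 13:00 UTC.
print(date_time_equal("2002-04-02T12:00:00-01:00",
                      "2002-04-02T17:00:00+04:00"))  # True
```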

2. op:dateTime-less-than


op:dateTime-less-than($arg1 as xs:dateTime, $arg2 as xs:dateTime) as xs:boolean


3. op:dateTime-greater-than


op:dateTime-greater-than( $arg1  as xs:dateTime,
                          $arg2  as xs:dateTime) as xs:boolean


4. op:date-equal


op:date-equal($arg1 as xs:date, $arg2 as xs:date) as xs:boolean


Returns true if and only if the starting instant of $arg1 is equal to the starting instant of $arg2. Returns false otherwise. The starting instant of an xs:date is the xs:dateTime at time 00:00:00 on that date.

5. op:date-less-than


op:date-less-than($arg1 as xs:date, $arg2 as xs:date) as xs:boolean


6. op:date-greater-than


op:date-greater-than($arg1 as xs:date, $arg2 as xs:date) as xs:boolean


1.4.2 Component Extraction Functions on Dates and Times

1. fn:year-from-dateTime


fn:year-from-dateTime($arg as xs:dateTime?) as xs:integer?


Returns an xs:integer representing the year component in the localized value of $arg. The result may be negative.

2. fn:month-from-dateTime


fn:month-from-dateTime($arg as xs:dateTime?) as xs:integer?


3. fn:day-from-dateTime


fn:day-from-dateTime($arg as xs:dateTime?) as xs:integer?


4. fn:hours-from-dateTime


fn:hours-from-dateTime($arg as xs:dateTime?) as xs:integer?


5. fn:minutes-from-dateTime


fn:minutes-from-dateTime($arg as xs:dateTime?) as xs:integer?


6. fn:seconds-from-dateTime


fn:seconds-from-dateTime($arg as xs:dateTime?) as xs:decimal?


7. fn:timezone-from-dateTime


fn:timezone-from-dateTime($arg as xs:dateTime?) as xs:dayTimeDuration?


Returns the timezone component of $arg if any. If $arg has a timezone component, then the result is an xs:dayTimeDuration that indicates deviation from UTC; its value may range from +14:00 to -14:00 hours, both inclusive. Otherwise, the result is the empty sequence.
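The component extraction functions on xs:dateTime listed so far can be approximated in Python as follows. The sketch returns the local (not UTC-normalized) components, uses None for an absent timezone, and represents the xs:dayTimeDuration timezone as a timedelta. Python's datetime cannot represent negative years, so that part of the value space is out of scope here. The function name components is hypothetical.

```python
from datetime import datetime, timedelta
from decimal import Decimal


def components(lexical: str) -> dict:
    """Sketch: extract the seven date/time components of an xs:dateTime."""
    value = datetime.fromisoformat(lexical)
    # The seconds component is a decimal; the other five are integers.
    seconds = Decimal(value.second) + Decimal(value.microsecond) / Decimal(1_000_000)
    return {
        "year": value.year,
        "month": value.month,
        "day": value.day,
        "hours": value.hour,
        "minutes": value.minute,
        "seconds": seconds,
        # utcoffset() is None when the value has no timezone component.
        "timezone": value.utcoffset(),
    }


parts = components("1999-05-31T13:20:00-05:00")
print(parts["hours"])                                # 13
print(parts["timezone"] == timedelta(hours=-5))      # True
```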

8. fn:year-from-date


fn:year-from-date($arg as xs:date?) as xs:integer?


9. fn:month-from-date


fn:month-from-date($arg as xs:date?) as xs:integer?


10. fn:day-from-date


fn:day-from-date($arg as xs:date?) as xs:integer?


11. fn:timezone-from-date


fn:timezone-from-date($arg as xs:date?) as xs:dayTimeDuration?


12. fn:hours-from-time


fn:hours-from-time($arg as xs:time?) as xs:integer?


13. fn:minutes-from-time


fn:minutes-from-time($arg as xs:time?) as xs:integer?


14. fn:seconds-from-time


fn:seconds-from-time($arg as xs:time?) as xs:decimal?


15. fn:timezone-from-time


fn:timezone-from-time($arg as xs:time?) as xs:dayTimeDuration?


1.4.3 Timezone Adjustment Functions on Dates and Time Values

1. fn:adjust-dateTime-to-timezone


fn:adjust-dateTime-to-timezone($arg as xs:dateTime?) as xs:dateTime?

fn:adjust-dateTime-to-timezone( $arg      as xs:dateTime?,
                                $timezone as xs:dayTimeDuration?) as xs:dateTime?


Adjusts an xs:dateTime value to a specific timezone, or to no timezone at all. If $timezone is the empty sequence, returns an xs:dateTime without a timezone. Otherwise, returns an xs:dateTime with a timezone.
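A minimal Python sketch of the two-argument form is given below. The one-argument form depends on an implicit timezone and is not modelled. The empty sequence is represented by None, the timezone by a timedelta, and the ±14:00 bound is taken from the range stated for fn:timezone-from-dateTime. The function name is illustrative.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def adjust_date_time_to_timezone(arg: Optional[datetime],
                                 tz: Optional[timedelta]) -> Optional[datetime]:
    """Sketch of two-argument fn:adjust-dateTime-to-timezone."""
    if arg is None:
        return None
    if tz is None:
        # No target timezone: keep the local value, drop the timezone.
        return arg.replace(tzinfo=None)
    if abs(tz) > timedelta(hours=14):
        raise ValueError("timezone must lie between -14:00 and +14:00")
    target = timezone(tz)
    if arg.tzinfo is None:
        # No timezone yet: attach the target without changing the local value.
        return arg.replace(tzinfo=target)
    # Otherwise convert the same instant into the target timezone.
    return arg.astimezone(target)


value = datetime.fromisoformat("2002-03-07T10:00:00-07:00")
print(adjust_date_time_to_timezone(value, timedelta(hours=-10)).isoformat())
# 2002-03-07T07:00:00-10:00
```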

2. fn:adjust-date-to-timezone


fn:adjust-date-to-timezone($arg as xs:date?) as xs:date?

fn:adjust-date-to-timezone( $arg         as xs:date?,
                            $timezone    as xs:dayTimeDuration?) as xs:date?


Adjusts an xs:date value to a specific timezone, or to no timezone at all. If $timezone is the empty sequence, returns an xs:date without a timezone. Otherwise, returns an xs:date with a timezone. For purposes of timezone adjustment, an xs:date is treated as an xs:dateTime with time 00:00:00.

3. fn:adjust-time-to-timezone


fn:adjust-time-to-timezone($arg as xs:time?) as xs:time?

fn:adjust-time-to-timezone( $arg         as xs:time?,
                            $timezone    as xs:dayTimeDuration?) as xs:time?


Adjusts an xs:time value to a specific timezone, or to no timezone at all. If $timezone is the empty sequence, returns an xs:time without a timezone. Otherwise, returns an xs:time with a timezone.



2 Discussion (Open Issues)

  1. Are the listed built-ins also those supported by Core?
  2. Syntactic Representation of Built-ins in RIF (see below)
  3. Higher-order Built-ins (see Axel's message)
  4. Binding patterns (see Axel's message)
    1. Should BLD specify binding patterns for the supported built-ins?
    2. Should binding patterns be specified by each RIF dialect?
    3. Note that there are no binding patterns specified in SWRL.
  5. Time arithmetics (raised by MK)
    1. Should BLD also support arithmetic operations on durations, dates, and times (see the list from XQuery 1.0 and XPath 2.0)?
  6. Two additional categories of built-ins (raised by Jos)
    1. testing whether a specific value is of a specific datatype
      • E.g., isString/1 could be a built-in predicate whose extension is always
    2. testing whether a specific value is not of a specific datatype.
      • E.g., isNotString/1 could be a built-in predicate whose extension is
  7. BLD built-ins and RDF (raised by Dave in this email)
ReturnType OpName(Arg1Type Arg1, Arg2Type Arg2,...,ArgNType ArgN)

xsd:boolean   isIRI (RDF term term)
xsd:boolean   isURI (RDF term term)

xsd:boolean   isBlank (RDF term term)

xsd:boolean   isLiteral (RDF term term)

simple literal   lang (literal ltrl)

IRI   datatype (typed literal typedLit)
IRI   datatype (simple literal simpleLit)

xsd:boolean  langMatches (simple literal language-tag,
                          simple literal language-range)

Returns true if language-tag (first argument) matches language-range (second argument) per the basic filtering scheme defined in RFC4647 section 3.3.1. language-range is a basic language range per Matching of Language Tags, RFC4647 section 2.1. A language-range of "*" matches any non-empty language-tag string.

Also, it is proposed to add the following constructors for the new datatypes:

text(Text, Lang)

bNode(A, ... X)
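The basic filtering scheme referred to in the description of langMatches above can be sketched in a few lines of Python: a range matches a tag if the two are equal ignoring case, or if the range is a prefix of the tag that is immediately followed by "-". The function name lang_matches is a hypothetical rendering of langMatches.

```python
def lang_matches(language_tag: str, language_range: str) -> bool:
    """Sketch of langMatches using RFC 4647 basic filtering."""
    tag = language_tag.lower()
    rng = language_range.lower()
    if rng == "*":
        # The wildcard range matches any non-empty language tag.
        return tag != ""
    # Case-insensitive equality, or a prefix match ending at a "-" boundary.
    return tag == rng or tag.startswith(rng + "-")


print(lang_matches("en-US", "en"))  # True
print(lang_matches("en", "en-US"))  # False
```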


2.1 Syntactic Representation of built-ins in RIF

   ATOMIC      ::= Uniterm | Equal | ExtTerm

   TERM        ::= Const | Var | Uniterm | ExtTerm

In the productions above, ExtTerm is added as an alternative, and a new production for ExtTerm is to be added. One proposal for the syntactic symbol for built-ins is:

   ExtTerm     ::= 'Builtin ( ' Uniterm ' ) '


2.1.1 Other proposals for the syntax

  1. Eval proposal
   ExtTerm     ::= 'Eval ( ' Uniterm ' ) '


  2. Apply proposal
   ExtTerm     ::= 'Apply ( ' Uniterm ' ) '

Comment by csma: I would prefer something like

   ExtTerm      ::= ' ( Apply ' Const TERM* ' ) '



Presentation Syntax:

 ( Apply predfunc
   argument1
   . . .
   argumentn
 )

XML Syntax:

<ExtTerm>
  <op>predfunc</op>
  <arg>argument1</arg>
   . . .
  <arg>argumentn</arg>
</ExtTerm>
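To make the correspondence between the two forms of the Apply proposal concrete, the following Python sketch renders one built-in application in both. It is only an illustration of the proposal as written above: the element names ExtTerm, op and arg come from that proposal, while the function names and the flat string arguments are simplifying assumptions.

```python
import xml.etree.ElementTree as ET
from typing import Sequence


def to_presentation(predfunc: str, arguments: Sequence[str]) -> str:
    """Render the proposed presentation syntax: ( Apply predfunc arg1 ... argn )."""
    return "( Apply " + " ".join([predfunc, *arguments]) + " )"


def to_xml(predfunc: str, arguments: Sequence[str]) -> str:
    """Render the proposed XML syntax with one <arg> element per argument."""
    ext_term = ET.Element("ExtTerm")
    ET.SubElement(ext_term, "op").text = predfunc
    for argument in arguments:
        ET.SubElement(ext_term, "arg").text = argument
    return ET.tostring(ext_term, encoding="unicode")


print(to_presentation("predfunc", ["argument1", "argument2"]))
# ( Apply predfunc argument1 argument2 )
print(to_xml("predfunc", ["argument1", "argument2"]))
# <ExtTerm><op>predfunc</op><arg>argument1</arg><arg>argument2</arg></ExtTerm>
```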


  3. & proposal (Axel's proposal)
    • Example: Strawman for the presentation syntax
 &fn:dateTime( "2006-08-15"^^xs:date "12:30:45-05:00"^^xs:time )

A leading '&' is followed by a Const. Ideally I (Axel, personal statement) would prefer CURIs and not allow full IRIs here, but it seems that a CURI and a prefix definition mechanism are still missing in the current BNF.

      ExtTerm     ::= '&' Const ' ( ' TERM* ' ) '

      <ExtTerm>
       <op>
        <Const type="&rif;iri">
          http://www.w3.org/2005/xpath-functions/#dateTime
        </Const>
       </op>
       <arg><Const type="&xs;date">2006-08-15</Const></arg>
       <arg><Const type="&xs;time">12:30:45-05:00</Const></arg>
      </ExtTerm>

The abstract model would likewise need to be extended, but this extension is minor.

Comment by Jos: I do not find the "&"-prefix solution very elegant. It suggests that "&" is a part of the name, or a modifier of the name, whereas it is meant as a modifier of the complete term.