Re: [DTB] summary of editorial issues (completes ACTION-552) from Axel Polleres on 2008-08-27 (public-rif-wg@w3.org from August 2008)

From: Axel Polleres <axel.polleres@deri.org>
Date: Wed, 27 Aug 2008 18:55:51 +0100
To: Jos de Bruijn <debruijn@inf.unibz.it>
CC: "Public-Rif-Wg (E-mail)" <public-rif-wg@w3.org>
Message-ID: <48B59527.5040208@deri.org>
Jos de Bruijn wrote:
> 
> Axel Polleres wrote:
>> Jos de Bruijn wrote:
>>> <snip/>
>>>
>>>> 3) In the course of the rdf:text discussions, we discussed that a
>>>> function/predicate is needed for matching language tags against
>>>> language ranges (subtag matching) according to RFC4647. (This is not
>>>> yet reflected by an editor's note in the current draft.) I propose
>>>> to add:
>>>>
>>>> pred:matches-langtag( ?arg1 , ?arg2 )
>>>>
>>>>  intended domains:
>>>>    - arg1 rdf:text
>>>>    - arg2 valid language range according to
>>>>        http://www.rfc-editor.org/rfc/rfc4647.txt
>>> Why would this be necessary/useful?  We already have a function for
>>> extracting language tags.
>>>
>>> pred:matches-langtag( ?arg1 , ?arg2 )
>>> is the same as
>>> func:lang(?arg1)=?arg2
>> Jos, if you mean that the same functionality could be achieved with
>>
>>   pred:matches
>>
>> (http://www.w3.org/2005/rules/wiki/DTB#pred:matches_.28adapted_from_fn:matches.29)
>>
>>
>> then the answer is: yes and no
> 
> Wouldn't the answer be: yes, but language patterns generally have a more
> convenient syntax?
> 
>> lang-pattern-matching in
>>  http://www.rfc-editor.org/rfc/rfc4647.txt
>> is different from regular expression  matching in pred:matches.
>>
>> For example, the extended language range "en-*-US" maps to "en-US"
>> (English, United States), also matching is case insensitive, which is
>> quite different from matching a regexp (although I don't say it can't
>> all be expressed in a regexp, this regexp might become fairly nasty)
> 
> In rif:text, language tags are lower case.  What about rdf:text?
> 
>> Since lang-pattern wildcards are still something which seems to be often
>> used in connection with language tags, I suggest to have a separate
>> function for that.
> 
> I wonder how often they are actually used and whether we want to require
> every BLD implementer to implement this specific kind of pattern matching.

DTB is a catalogue of functions and predicates. If we stick with BLD 
importing all DTB functions wholesale ("import all") instead of 
referencing them individually, then yes, every BLD implementer would 
have to implement it.
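
To make the difference from regexp matching concrete, here is a minimal
sketch of the extended filtering procedure from RFC 4647, section 3.3.2.
The function name extended_filter is hypothetical and is not a proposed
DTB built-in; well-formedness checks on the range and the tag are
deliberately left out.

```python
def extended_filter(lang_range: str, tag: str) -> bool:
    """Sketch of RFC 4647 extended filtering (section 3.3.2).

    Assumes both arguments are well-formed; no validation is done.
    """
    # Matching is case-insensitive, so normalise both sides first.
    range_parts = lang_range.lower().split("-")
    tag_parts = tag.lower().split("-")

    # The first subtags must match, unless the range starts with "*".
    if range_parts[0] != "*" and range_parts[0] != tag_parts[0]:
        return False

    r, t = 1, 1
    while r < len(range_parts):
        if range_parts[r] == "*":
            # A wildcard matches zero or more subtags; skip it.
            r += 1
        elif t >= len(tag_parts):
            # The range still has subtags but the tag is exhausted.
            return False
        elif range_parts[r] == tag_parts[t]:
            r += 1
            t += 1
        elif len(tag_parts[t]) == 1:
            # Singletons (e.g. "x") may not be skipped over.
            return False
        else:
            # Skip a non-matching, non-singleton subtag in the tag.
            t += 1
    return True
```

With this sketch, extended_filter("en-*-US", "en-US") and
extended_filter("de-*-DE", "de-Latn-DE") both hold, while
extended_filter("de-*-DE", "de-x-DE") does not.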

>>> <snip/>
>>>
>>>> 5) Editor's Note: It was noted in discussions of the working group that,
>>>> besides the guard predicates, an analogous built-in function or
>>>> predicate to SPARQL's datatype function is also needed. This however has some
>>>> technical implications, see
>>>> http://lists.w3.org/Archives/Public/public-rif-wg/2008Jul/0096.html
>>>>
>>>> PROPOSED: We could, analogous to pred:iri-to-string, define a
>>>> predicate
>>>>
>>>>  pred:matches-datatype( ?arg1 ?arg2)
>>>>
>>>> such that the predicate is true iff ?arg1 is in the value space of
>>>> the datatype denoted by ?arg2 . An open question is whether we should
>>>> use the rif:iri or the string representing the datatypeIRI for the
>>>> second argument, i.e. what is the intended domain for ?arg2 ??
>>> I don't really see how this could be defined in a meaningful way.  In
>>> any case, we already have the guard predicates, so I don't see the use.
>> The use case is simple: I want to emulate the datatype function from
>> SPARQL in RIF... I want to know whether a literal is an integer or a
>> decimal. It is quite obvious, that we can't define a function which does
> 
> Other than in SPARQL, in BLD you do not have literals, but you have
> values.  So, for example, every integer is a decimal.
>
> People who expect SPARQL-like behavior might find this odd.

I see the point. That means the answer to the question whether we can 
emulate at least the SPARQL built-ins in some way in RIF is then "no", 
which is annoying...
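
The literal-versus-value distinction can be illustrated with a small
sketch. All names here (Literal, sparql_datatype, value_of,
in_value_space) are hypothetical, only xs:integer and xs:decimal are
modelled, and this is an illustration rather than a proposed definition
of pred:matches-datatype.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Literal:
    """A SPARQL-style literal: lexical form plus datatype IRI."""
    lexical: str
    datatype: str


def sparql_datatype(lit: Literal) -> str:
    # SPARQL's datatype() looks at the literal itself.
    return lit.datatype


def value_of(lit: Literal) -> Decimal:
    # In BLD only the denoted value is available, not the literal.
    return Decimal(lit.lexical)


def in_value_space(value: Decimal, datatype: str) -> bool:
    # Membership test on values: every integer is also a decimal.
    if datatype == "xs:decimal":
        return value.is_finite()
    if datatype == "xs:integer":
        return value.is_finite() and value == value.to_integral_value()
    raise ValueError(f"datatype not modelled in this sketch: {datatype}")


one_int = Literal("1", "xs:integer")
one_dec = Literal("1.0", "xs:decimal")
```

Here sparql_datatype distinguishes one_int from one_dec, but both denote
the same value, and that value is in the value space of xs:integer as
well as xs:decimal, so a value-based predicate cannot tell them apart.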


>> this, the predicate is an alternative suggestion, I will not fight for
>> it if no one else sees the need to at least cover the expressivity of
>> the built-ins in SPARQL... although I find this awkward at least.

If I wanted a rules dialect which covers these built-ins, say one based 
on Datalog, this now dashes my hopes that it could be done in a way 
that builds on BLD/FLD semantics, which I find unpleasant, if not 
disappointing. If this is really the case, I would tend to raise this 
as an issue, actually.

>>  <snip/>
>>>> 7) Editor's Note: In the following, we adapt several cast functions from
>>>> [XPath-Functions]. Due to the subtle differences in e.g. error handling
>>>> between RIF and [XPath-Functions], these definitions might still need
>>>> refinement in future versions of this draft.
>>>>
>>>> Indeed I need to check back on Jos' exact concerns here, he thought that
>>>> referring to the [XPath-Functions] conversions is not precise enough
>>>> here, see also 8)
>>> Basically, the interpretations of the functions are not completely
>>> defined.
>> yes.
>>
>>>> 8) Editor's Note: We might split this subsection into separate
>>>> subsections per casting function in future versions of this document,
>>>> following the convention of having one separate subsection per
>>>> function/predicate in the rest of the document. However, it seemed
>>>> convenient here to group the cast functions which purely rely on XML
>>>> Schema datatype casting into one common subsection.
>>>>
>>>> I can separate them, if the majority of the working group thinks this is
>>>> necessary.
>>> I'd say: either follow the principle of having one subsection per
>>> predicate/function (I personally don't see the use of that) or don't
>>> follow this principle.
>>> in the former case, you need to split up the mentioned subsection.  In
>>> the latter case, many subsections in the document can be merged.
>> yes.
>>
>>>> 9) Editor's Note: The cast from rif:text to xs:string is still under
>>>> discussion, i.e. whether the lang tag should be included when casting to
>>>> xs:string or not.
>>>>
>>>> PROPOSED. replace rif:text by rdf:text, otherwise leave as is.
>>> I don't remember whether we discussed this in the working group.
>> yes, it needs to be discussed/approved. casts from rdf:text to xs:string
>> are not covered by standard conversions in XPath/XQuery, but the
>> suggested treatment covers it analogously to:
>>
>>    http://www.w3.org/TR/rdf-sparql-query/#func-str
> 
> This is not a cast function; it is a string extraction function,
> analogous to the language tag extraction function.
> I think it is more intuitive to use such an extraction function, rather
> than a cast function, for extracting strings from rdf:text values.

Fair enough, I could live with that.

>>> <snip/>
>>>
>>>> 12) Editor's Note: The working group is currently discussing, whether in
>>>> addition to adopting the fn:compare function from [XPath-Functions], own
>>>> predicates pred:string-equal, pred:string-less-than,
>>>> pred:string-greater-than, pred:string-not-equal,
>>>> pred:string-less-than-or-equal, pred:string-greater-than-or-equal not
>>>> defined in [XPath-Functions] shall be introduced, following the
>>>> convention of having such predicates for other datatypes.
>>>>
>>>> PROPOSED: introduce additional comparison predicates.
>>> Why would we want to have these comparison predicates and what does it
>>> mean for one string to be less than another?
>> Suggested by Gary, the idea is to have uniformity, i.e. predicates
>> less-than, greater-than, equal, less-than-or-equal,
>> greater-than-or-equal, for all (or at least most) datatypes,
> 
> In that case you would need this kind of comparison also for things like
> XML literal and rdf:text.

no problem with that.

>> where this
>> can be defined in a feasible manner.
> 
> How can this be defined for strings?  This was my question.

Lexicographic comparison, left to right, starting from the first character.
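
A minimal sketch of such a comparison, assuming plain Unicode code point
order and no other collation. The names string_compare and
string_less_than are hypothetical stand-ins for the proposed predicates.

```python
def string_compare(a: str, b: str) -> int:
    """Return -1, 0 or 1, comparing left to right by code point."""
    for ca, cb in zip(a, b):
        if ca != cb:
            # The first differing character decides the order.
            return -1 if ord(ca) < ord(cb) else 1
    # One string is a prefix of the other: the shorter one sorts first.
    return (len(a) > len(b)) - (len(a) < len(b))


def string_less_than(a: str, b: str) -> bool:
    return string_compare(a, b) < 0


def string_greater_than_or_equal(a: str, b: str) -> bool:
    return string_compare(a, b) >= 0
```

Under this assumption "abc" is less than "abd", and "ab" is less than
"abc". Upper-case letters sort before lower-case ones in code point
order, so "B" is less than "a".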

>> If there is disagreement here on grounds of redundancy,
>> then we also have to revisit the less-than-or-equal and
>> greater-than-or-equal predicates which were approved by the group, since
>> they are likewise superfluous.
> 
> I don't care about the redundancy here.

good.

>>>> 13) Editor's Note: No less-than-or-equal or greater-than-or-equal
>>>> predicates are defined in this draft for durations, since there are no
>>>> separate op:dayTimeDuration-equal nor
>>>> op:yearMonthDuration-equal predicates in [XPath-Functions], but only a
>>>> common predicate op:duration-equal. Future versions of this working
>>>> draft may resolve this by introducing new equality predicates
>>>> pred:dayTimeDuration-equal and pred:yearMonthDuration-equal with
>>>> restricted intended domains.
>>>>
>>>> PROPOSED: introduce a single predicate duration-equal that only
>>>> evaluates to true if the arguments are both of the same duration subtype
>>>> and equal.
>>> Agreed.
>>>
>>>> 14) Editor's Note: Predicates for rdf:XMLLiteral such as at least
>>>> comparison predicates (equals, not-equals) are still under discussion in
>>>> the working group.
>>>>
>>>> PROPOSED: introduce equals and not-equals for XMLLiteral which matches
>>>> modulo white-spaces in non-text content.
>>> Two XML literals are equal if their values (as defined in [1]) are the
>>> same and not-equal if their values are not the same. I cannot imagine
>>> any other meaningful definition for equality of XML literals.
>>>
>>> [1] http://www.w3.org/TR/rdf-concepts/#section-XMLLiteral
>> ok, that doesn't include white-space normalization or the like...
> 
> If you want to have whitespace normalization, you should either use a
> different data type or introduce a function for this kind of
> normalization. Using XMLLiteral-equals for checking anything but
> equality of XMLLiteral values is misleading.

I didn't see any function like that for XML, does anyone have an opinion 
about it? Anyway, such a function might be a good idea indeed.

>> for that actually "=" suffices, doesn't it?
> 
> For the equals, yes.  Not-equals would be a different thing.

yup.

>> If the group is fine with
>>
>> pred:XMLLiteral-not-equals("<a/>"^^rdf:XMLLiteral
>>                            "<a />"^^rdf:XMLLiteral)
>>
>> then fair enough. As far as I understood, XML prescribes some
>> normalization of end-of-lines
>>
>>  http://www.w3.org/TR/2000/REC-xml-20001006#sec-line-ends
>>
>> and for white spaces in attribute values
>>
>>  http://www.w3.org/TR/2000/REC-xml-20001006#AVNormalize
>>
>> Do we need to bother about this?

Any opinions about this?


Thanks for all the comments,
Axel

>>>> 15) Editor's Note: The current name of this function is still under
>>>> discussion in the working group. Alternative proposals include e.g.
>>>> func:lang-from-text, which follows the XPath/XQuery naming convention
>>>> for extraction functions from datatypes rather than the SPARQL naming
>>>> convention.
>>>>
>>>> PROPOSED: change to func:lang-from-text and only add a remark that this
>>>> is related to SPARQL's lang-function.
>>> Agreed.
>>>
>>>> 16) Editor's Note: We have not yet included comparison predicates
>>>> (equal, less-than, greater-than, or compare ...) for rif:text. Future
>>>> versions of this document might introduce these.
>>>>
>>>> PROPOSED: only add equal and not-equal for rdf:text, for more
>>>> sophisticated comparisons conversions to strings and the more
>>>> fine-grained comparisons on  strings can be used.
>>> Agreed.
>>>
>>>
>>>
>>> Best, Jos
>>>
>>
> 


-- 
Dr. Axel Polleres, Digital Enterprise Research Institute (DERI)
email: axel.polleres@deri.org  url: http://www.polleres.net/

Everything is possible:
rdfs:subClassOf rdfs:subPropertyOf rdfs:Resource.
rdfs:subClassOf rdfs:subPropertyOf rdfs:subPropertyOf.
rdf:type rdfs:subPropertyOf rdfs:subClassOf.
rdfs:subClassOf rdf:type owl:SymmetricProperty.
Received on Wednesday, 27 August 2008 17:56:33 UTC