1824 – Functions return xs:NCName, but xs:NCName is not supported in XSLT 2.0 Basic

This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Status:	CLOSED FIXED

Alias:	None

Product:	XPath / XQuery / XSLT
Classification:	Unclassified
Component:	Functions and Operators 1.0
Version:	Last Call drafts
Hardware:	PC Linux

Importance:	P2 normal
Target Milestone:	---
Assignee:	Ashok Malhotra
QA Contact:	Mailing list for public feedback on specs from XSL and XML Query WGs

Reported:	2005-07-28 15:56 UTC by Frans Englich
Modified:	2005-09-29 10:55 UTC
CC List:	0 users

Description Frans Englich 2005-07-28 15:56:42 UTC

Hello,

There are functions which return instances of xs:NCName:

fn:prefix-from-QName($arg as xs:QName?) as xs:NCName?
fn:local-name-from-QName($arg as xs:QName?) as xs:NCName?

From my understanding, they guarantee to return atomic values that not only have
a value space conforming to xs:NCName, but are also of the type xs:NCName (or
they are the empty sequence). From what I can tell, one should be able to rely
on that:

"prefix-from-QName(QName("http:example.org/", "prefix:local")) instance of
xs:NCName" is always true.

But the type xs:NCName is not a supported type in, for example, XSL-T 2.0 at
Basic conformance level (from what I can tell), although I don't know if that
matters.

How is this dilemma solved? What would allow an implementation to return
xs:string instead of xs:NCName, for example?

If that is not possible, which currently is my conclusion, I would say the best
solution is to change the return types to "xs:string?".


Cheers,
Frans

Comment 1 Michael Kay 2005-07-28 19:05:49 UTC

This one is going to be a bit tricky. As it happens, it arrived just before an
XSL WG telcon and we had time for a bit of preliminary discussion.

At one time the set of types supported by a basic XSLT processor was chosen to
align with the set of types used in the signatures of F+O functions. This
alignment broke when two functions were changed to return an NCName. On the face
of it there are three solutions possible:

(a) change the functions back to returning strings (there are other functions
that create a precedent for this)

(b) change XSLT to allow xs:NCName in its set of supported types (together,
presumably with its supertypes xs:Name, xs:token, and xs:normalizedString)

(c) finesse the definition of a basic XSLT processor so that if it calls a
function that returns a value of a type that's a subtype of a recognized type,
the value is in effect promoted to the nearest recognised type. Section 2.5 of
XPath is carefully crafted to recognize the possibility of input values having a
dynamic type that is not one of the statically known types, but is a subtype
thereof, and we could build on this capability.
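
A minimal sketch of what option (c) could look like, written in Python purely
for illustration. The PARENT table is a fragment of the built-in schema type
hierarchy; the RECOGNIZED set and the function name nearest_recognized are
assumptions made for this sketch, not anything defined by the specs.

```python
# Fragment of the built-in schema type hierarchy, child -> parent.
PARENT = {
    "xs:NCName": "xs:Name",
    "xs:Name": "xs:token",
    "xs:token": "xs:normalizedString",
    "xs:normalizedString": "xs:string",
    "xs:string": "xs:anyAtomicType",
}

# Assumption: an illustrative subset of the types a basic processor recognizes.
RECOGNIZED = {"xs:string", "xs:anyAtomicType"}


def nearest_recognized(type_name):
    """Walk up the hierarchy until a recognized type is found."""
    current = type_name
    while current is not None:
        if current in RECOGNIZED:
            return current
        current = PARENT.get(current)
    raise ValueError(f"no recognized supertype for {type_name}")


print(nearest_recognized("xs:NCName"))  # prints xs:string
```

Under these assumptions a value typed xs:NCName is promoted to xs:string, and a
value that is already xs:string is left as it is.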

We'll need joint meeting time with XQuery to discuss this one, which means it
won't be addressed for a month or so as there's a break in the meeting schedule.

Michael Kay (personal response)

Comment 2 Sharon Adler 2005-07-28 19:32:37 UTC

Ashok,

I had the action item from the XSL telcon to ask for review for joint discussion
of this bug.  Looks like Michael beat me to it.  He indicated his was personal;
mine is formal from the XSL WG.  Please let us know when this will be handled so
we can make sure we have proper XSL representation on the call.

Thanks.

Sharon

Comment 3 Michael Kay 2005-09-06 10:38:26 UTC

I think we can solve this as follows: (for the benefit of WG members, the white
horse comes to our rescue).

XPath says in 2.5.4:

 "An unknown schema type might be encountered, for example, if a source document
has been validated using a schema that was not imported into the static context.
In this case, an implementation is allowed (but is not required) to provide an
implementation-dependent mechanism for determining whether the unknown schema
type is derived from the expected schema type."

The situation that's occurring here is that a function (such as
local-name-from-QName) is returning a value of dynamic type xs:NCName, which in
a Basic XSLT Processor is an "unknown type". So all we have to do is to say that
a Basic XSLT Processor MUST provide a mechanism for determining whether an
unknown schema type is derived from a given schema type in the case where the
unknown schema type is a built-in type. In other words, a basic XSLT processor
needs to know that xs:NCName is derived from xs:string, even though it offers no
other support for xs:NCName.

Once we say that, I believe everything else falls into place and the problem
disappears. This is enough to ensure that the returned NCName can be used as a
string.
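
The idea can be sketched in Python as follows. This is only a model under
stated assumptions: STATIC_CONTEXT is an illustrative subset of the known
types, and is_derived_from and instance_of are hypothetical helper names. A
test against a type that is not in the static context is modelled here as a
LookupError.

```python
# Fragment of the built-in schema type hierarchy, child -> parent.
PARENT = {
    "xs:NCName": "xs:Name",
    "xs:Name": "xs:token",
    "xs:token": "xs:normalizedString",
    "xs:normalizedString": "xs:string",
    "xs:string": "xs:anyAtomicType",
}

# Assumption: illustrative subset of the types in the static context.
STATIC_CONTEXT = {"xs:string", "xs:anyAtomicType"}


def is_derived_from(dynamic_type, expected_type):
    """True if dynamic_type is expected_type or one of its subtypes."""
    current = dynamic_type
    while current is not None:
        if current == expected_type:
            return True
        current = PARENT.get(current)
    return False


def instance_of(dynamic_type, expected_type):
    """Type test as seen from a stylesheet: only known types can be named."""
    if expected_type not in STATIC_CONTEXT:
        raise LookupError(f"{expected_type} is not in the static context")
    return is_derived_from(dynamic_type, expected_type)


print(instance_of("xs:NCName", "xs:string"))  # prints True
```

In this model the stylesheet cannot ask whether the value is an xs:NCName, but
the value still passes wherever an xs:string is expected.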

I think we can handle this with the following addition to XSLT section 21.1:

<new>
For a Basic XSLT Processor, schema built-in types that are not included in the
static context (for example, xs:NCName) are "unknown types" in the sense of
XPath section 2.5.4. In the language of that section, a Basic XSLT Processor
MUST be able to determine whether these unknown types are derived from known
schema types such as xs:string. The purpose of this rule is to ensure that
system functions such as fn:local-name-from-QName(), which is defined to return
an xs:NCName, behave correctly. A stylesheet that uses a Basic XSLT Processor
will not be able to test whether the returned value is an xs:NCName, but it will
be able to use it as if it were an xs:string.
</new>

Michael Kay

Comment 4 Colin Adams 2005-09-06 11:26:26 UTC

I don't follow this solution, Michael.
The sentence you quote in XPath 2.5.4 cannot apply to a Basic XSLT processor,
since it is required to raise a non-recoverable dynamic error if it encounters a
node with such a type annotation.
Why not just say that a Basic XSLT processor treats these two functions as
returning an xs:string? After all, that is the effect of your suggested solution
anyway.

Comment 5 Michael Kay 2005-09-06 12:44:32 UTC

Colin: the reason I didn't want to do it the way you suggested is that it's not
a good idea for one spec to try and override another: "ignore what X says and do
it this way instead". Rather, it's better to show that this is just a special
case of a problem that's already addressed in the XPath spec, of how to handle
values with dynamic types that aren't present in the static context. The end
effect might be the same, but it's better to build on the mechanisms already
there than to introduce an arbitrary special case. This also points the way to
solving other related problems that might arise in the future, e.g. XSLT making
calls to a function library written in XQuery. We have to adapt the wording of
XSLT 21.1 a little to handle this, but it basically works.

Michael Kay

Comment 6 Colin Adams 2005-09-06 13:46:30 UTC

Well, it imposes an additional burden on the processor, which is unnecessary for
solving the particular problem.
So, if I (for instance), were to simply implement returning an xs:string from
these two functions, would there be any way of telling that I WASN'T
implementing a general recognition of schema types descended from the built-in
types?

Comment 7 Michael Kay 2005-09-06 14:35:32 UTC

You're always free to implement things as you like if no-one can tell the
difference. The way we describe things in the spec is only distantly related to
the way we expect real implementations to work. In this particular area - the
question of how much a processor knows about the type hierarchy - there's a lot
of abstraction going on in the spec to cater for a wide variety of
implementation architectures. For example, we have to cater for the possibility
that the XPath function library and the XSLT processor are separate components
developed by different organisations, in which case the function library really
will return an NCName, and the XSLT processor has to recognize it as such. Of
course if the library is embedded in your XSLT processor and will never be used
by anyone else, then you can short-circuit things by returning a string directly.
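
A rough Python model of why the two architectures cannot be told apart from a
basic stylesheet. Everything here is a simplification for illustration: the two
library functions are hypothetical stand-ins that take a lexical QName string
rather than a real xs:QName, and KNOWN_TYPES is an assumed subset of the types
a basic processor can name.

```python
# Fragment of the built-in schema type hierarchy, child -> parent.
PARENT = {
    "xs:NCName": "xs:Name",
    "xs:Name": "xs:token",
    "xs:token": "xs:normalizedString",
    "xs:normalizedString": "xs:string",
    "xs:string": "xs:anyAtomicType",
}

# Assumption: the only types a basic stylesheet can test against.
KNOWN_TYPES = ("xs:string", "xs:anyAtomicType")


def is_derived_from(dynamic_type, expected_type):
    """True if dynamic_type is expected_type or one of its subtypes."""
    current = dynamic_type
    while current is not None:
        if current == expected_type:
            return True
        current = PARENT.get(current)
    return False


def local_name_separate_library(lexical_qname):
    """Independent function library: really returns an NCName."""
    return ("xs:NCName", lexical_qname.split(":", 1)[-1])


def local_name_embedded_library(lexical_qname):
    """Library embedded in a basic processor: returns a plain string."""
    return ("xs:string", lexical_qname.split(":", 1)[-1])


def observable_behaviour(result):
    """The value plus the outcome of every type test a stylesheet can write."""
    type_name, value = result
    tests = tuple(is_derived_from(type_name, known) for known in KNOWN_TYPES)
    return value, tests


separate = observable_behaviour(local_name_separate_library("prefix:local"))
embedded = observable_behaviour(local_name_embedded_library("prefix:local"))
print(separate == embedded)  # prints True
```

Both variants yield the same value and the same answers to every type test
available in this model, which is the sense in which the short-circuit is
undetectable.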

Michael Kay

Comment 8 David Carlisle 2005-09-06 14:44:58 UTC

As a possibly only marginally related comment,
why is it that a basic XSLT processor only supports a subset of the built in
types? In many other respects a basic xslt processor corresponds strongly to an
xquery processor that does not support the optional static typing or schema
import features, but xquery specifies all built in types must be supported in
this case.

David

Comment 9 Michael Kay 2005-09-06 15:07:19 UTC

We made the decision not to support these types in Basic XSLT because in the
absence of schemas, they offer very little useful functionality, but add a lot
to the amount of code that an implementor has to write, and to the thickness of
the manual. We were keen to keep the bar for implementors as low as we could.

Michael Kay

Comment 10 Frans Englich 2005-09-07 13:10:09 UTC

     
(I jump in with humble opinions and questions about Michael's proposed
solution.)

While Michael's solution may "ensure that the returned NCName can be used as a
string" (I can't comment on that), I think there's another problem: the type
annotation is simply different. From what I can tell, with the proposed
solution, code introspecting the type annotation of the return value -- such
as in a filter expression -- would behave differently depending on whether it
was executed on a basic or a schema-aware implementation (unless the code
catered for it). I think my point is theoretically correct, but I'm not sure
how large the impact is in practice; perhaps it cannot occur at all (one
cannot specify xs:NCName in basic).

An important aspect of XPath 2.0, in my opinion, is that there's a type system,
which means one can rely on it and deal with "types". However, to me it looks
like the proposed solution makes an exception to this: even though the function
signature in F&O says "this function returns type Foo", it "sometimes" does not
hold true.

I find it natural that this dilemma occurs. XSL-T Basic requires F&O, F&O
requires xs:NCName, but XSL-T Basic does not have xs:NCName. When not using a
solution similar to Michael's, two alternatives exist: removing/changing the
functions that require xs:NCName, or adding the type(s) to XSL-T Basic.
Assuming the latter is ruled out, how does the discussion concerning the former
sound? What are the disadvantages of changing the return type to xs:string?
Why (if at all) is it not an option, and how does it compare to the proposed
solution?

The problem I see with the proposed solution is that it adds complexity and
exceptions. Hence, I would find it interesting to look at a solution that is
built upon the existing mechanisms.

(A loose thought that springs to mind is to somehow allow basic processors to
support derived primitive types, but that introduces well-known problems and
also introduces what I identify as a problem.)
  
  
Cheers,  
Frans

Comment 11 Colin Adams 2005-09-07 14:54:28 UTC

After reading the last two comments from Michael and Frans, I thought about how
I'd go about implementing Michael's suggested solution for my own product.
Here, the XPath library is totally independent of the XSLT library (dependencies
are all the other way), but it does know if it is being called by a Basic-level
XSLT processor or not.
So I think I will simply return an xs:string in this case, and an xs:NCName in
all other cases. This seems to be a practical solution, as a user will not be
able to tell the difference between this and Michael's proposed solution of
knowing that xs:NCName is derived from xs:string.
I think Frans' point about the type annotation is interesting though - if the
output from a transformation in Basic XSLT mode is used as the input to another
transformation, without serialization, then the type annotation for a text
generated by such a function cannot be either xs:string or xs:NCName, as that
would violate the requirements for a Basic-level XSLT processor.
I'm just thinking off the top of my head now, so I'm not sure if my last
sentence is relevant to anything at all or not.

Comment 12 Michael Kay 2005-09-29 10:49:43 UTC

The Working Groups reviewed this carefully.

Essentially the groups agreed that the solution outlined in my comment #3 works.

Let's reiterate. XPath says:

 "An unknown schema type might be encountered, for example, if a source document
has been validated using a schema that was not imported into the static context.
In this case, an implementation is allowed (but is not required) to provide an
implementation-dependent mechanism for determining whether the unknown schema
type is derived from the expected schema type."

The example here isn't relevant, but the basic situation is the same: there's a
type encountered at run-time (in this case, the type xs:NCName) that isn't in
the static context. In this situation we allow implementations to have some kind
of mechanism for knowing that the "undeclared" type (xs:NCName) is in fact a
subtype of a "declared" type (in this case xs:string); if it has this knowledge
then it can safely treat this value as a string.

Note that this rule is carefully designed to preserve type safety. The
implementation can use any mechanism it likes to "know" that xs:NCName is
derived from xs:string, but its deduction must be correct.

One could argue that this example involves an unknown type at compile time,
rather than at run-time. In XQuery, which does static type analysis across
modules, this would be an issue. However, in XSLT (at least formally) all type
matching is done dynamically. So the dynamic type matching rules cited in XPath
are relevant.

Concrete implementations can handle this in many different ways: the XPath
formulation was deliberately chosen to be very abstract, recognising that
product architectures would be very different here. One implementation might
recognize the type xs:NCName at run-time even though it is not recognized at
compile time. Another implementation might choose to implement the F+O functions
to know that they are running in a basic XSLT environment and simply return a
string rather than an NCName in this situation. Some implementations might have
a run-time representation of the type xs:NCName that's actually a pointer to a
schema component within a data structure that represents the full type hierarchy.

So the resolution is the following addition to XSLT section 21.1:

<new>
For a Basic XSLT Processor, schema built-in types that are not included in the
static context (for example, xs:NCName) are "unknown types" in the sense of
XPath section 2.5.4. In the language of that section, a Basic XSLT Processor
MUST be able to determine whether these unknown types are derived from known
schema types such as xs:string. The purpose of this rule is to ensure that
system functions such as fn:local-name-from-QName(), which is defined to return
an xs:NCName, behave correctly. A stylesheet that uses a Basic XSLT Processor
will not be able to test whether the returned value is an xs:NCName, but it will
be able to use it as if it were an xs:string.
</new>

Comment 13 Michael Kay 2005-09-29 10:55:40 UTC

I am now closing this bug report because the working groups believe we have a
technically sound solution. If you have any questions of understanding relating
to this closure, it may be best to take them by email discussion on the public
comments list. If you wish to formally object to the closure, please re-open the
bug.

Michael Kay
for the XSLT and XQuery Working Groups