This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.
4.7.1 / Norm "we normalize each unit individually and construct a sequence of the normalized results interleaved with empty text nodes." This appears to cause failures in STA. Consider:
    <e>{ attribute a1 {"v1"} }{ attribute a2 {"v2"} }</e>
which is normalized to (roughly):
    element e {
      fs:item-sequence-to-node-sequence((
        attribute a1 {"v1"}, text {""}, attribute a2 {"v2"}
      ))
    } {}
The problem here is that there is an attribute node after a text node. This isn't a problem for DEv, because empty text nodes are deleted before checking for misplaced attribute nodes, but it is a problem for STA. Currently, STA fails for the call to fs:item-sequence-to-node-sequence(). If (as I suggest in Bug 3760) we drop that call from normalization, then the failure happens when we try to typecheck the element constructor. Are such failures intentional?
I don't think this is the intended semantics. This looks like a pretty serious problem for statically typed implementations. Considering how late we are in the process, I think I would recommend an ugly fix, which would let a combination of attribute and text nodes type check at the beginning of the type. Something like the following:
    statEnv |- Type <: (attribute|text)*, (element|document|text|processing-instruction|comment|xs:string|xs:float| ...|xs:NOTATION)*
    --------------------------------------------------------------------------
    statEnv |- (FS-URI,"item-sequence-to-node-sequence") (Type) : attribute*, (element|text|processing-instruction|comment)*
The following rule may look a bit more natural:
    statEnv |- Type <: (attribute*, (element|document|processing-instruction|comment|xs:string|xs:float| ...|xs:NOTATION)*) & text*
    --------------------------------------------------------------------------
    statEnv |- (FS-URI,"item-sequence-to-node-sequence") (Type) : attribute*, (element|text|processing-instruction|comment)*
The XSLT and XML Query WGs have decided to adopt the proposed change in comment #2, addressing that bug. Best, - Jerome, On Behalf of the WGs
The Rec does not reflect the change proposed in Comment #2 (i.e., loosening the input type for fs:item-sequence-to-node-sequence in 7.1.5). Instead, the loosening was applied to the type of the content expression of computed element constructors in 4.7.3.1 / STA. But that doesn't solve the original problem, because it's the call to the function, not the constructor, that sees the interleaved text nodes.
This issue has been entered as FS erratum E012. As a fix, I have undone the changes to 4.7.3.1 / STA and committed the change to 7.1.5 given in Comment #2. Consequently, I'm marking this issue resolved-FIXED, and CLOSED.
It occurs to me that this approach is unsound. For instance, consider: element e { text {"foo"}, attribute a {"x"} } The rules given in comment #1 or comment #2 will accept this and assign it a static type, but we know that dynamically it will raise a type error, because an attribute node follows a non-attribute node in the element constructor's content sequence.
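The gap between the loosened static rule and the dynamic check can be illustrated with a small Python model. This is only a sketch under stated assumptions: the one-letter encoding of node kinds, the function names, and the approximation of a static type by one concrete sequence of kinds are all invented for this example and are not FS notation. Document nodes and atomic types are left out of the pattern for brevity.

```python
import re

# Hypothetical one-letter encoding of item kinds (not FS notation):
# a = attribute, t = text, e = element, p = processing-instruction, c = comment
# Models the loosened premise of comment #1: (attribute|text)*, (element|text|...)*
LOOSENED_INPUT = re.compile(r"[at]*[etpc]*")


def kinds_of(nodes):
    """Project a list of (kind, value) pairs onto its string of kind letters."""
    return "".join(kind for kind, _ in nodes)


def statically_accepted(kinds):
    """True if the concrete kind sequence matches the loosened input pattern."""
    return LOOSENED_INPUT.fullmatch(kinds) is not None


def dynamically_accepted(nodes):
    """Drop empty text nodes, then reject any attribute node that
    follows a non-attribute node."""
    kept = [(k, v) for k, v in nodes if not (k == "t" and v == "")]
    seen_non_attribute = False
    for kind, _ in kept:
        if kind == "a" and seen_non_attribute:
            return False
        if kind != "a":
            seen_non_attribute = True
    return True


# <e>{ attribute a1 {"v1"} }{ attribute a2 {"v2"} }</e> after normalization
interleaved = [("a", "v1"), ("t", ""), ("a", "v2")]
# element e { text {"foo"}, attribute a {"x"} }
misplaced = [("t", "foo"), ("a", "x")]

for name, nodes in (("interleaved", interleaved), ("misplaced", misplaced)):
    print(name, statically_accepted(kinds_of(nodes)), dynamically_accepted(nodes))
```

In this model both sequences pass the static pattern, but only the first survives the dynamic check, which is the unsoundness described above.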
Moreover, the other case of interleaving empty text nodes, in the normalization of direct attribute constructors (4.7.1.1), doesn't work. Consider the direct attribute constructor:
    a="{4}{2}"
which is currently normalized to
    attribute a { fs:item-sequence-to-untypedAtomic(( (4), text {""}, (2) )) }
At evaluation time, the value passed to the function is
    4, text {""}, 2
Section 7.1.7 says that it applies the rules in XQuery 3.7.3.2, so:
    (1) Atomize the sequence, yielding: 4, "", 2
    (2) Each of these atomic values is cast into a string: "4", "", "2"
    (3) Merge these strings by concatenation with a single space between each pair: "4  2" (A space between "4" and "", and another between "" and "2".)
The resulting string becomes the string-value of the new attribute node. But in fact the value of the attribute is "42".
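The three steps above can be traced with a short Python sketch. The tuple representation of a text node and the function names are assumptions made for this illustration only; Python's str() stands in for the cast of an atomic value to a string.

```python
def atomize(item):
    """Atomic values stand for themselves; a text node, modelled here as a
    ("text", content) tuple, atomizes to its string content."""
    if isinstance(item, tuple) and item[0] == "text":
        return item[1]
    return item


def attribute_value_with_interleaving(items):
    """Atomize, cast each atomic value to a string, join with single spaces."""
    strings = [str(atomize(item)) for item in items]
    return " ".join(strings)


# a="{4}{2}" as currently normalized, with the interleaved empty text node
normalized_content = [4, ("text", ""), 2]
result = attribute_value_with_interleaving(normalized_content)
print(repr(result))  # two spaces between the digits, not "42"
```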
Re the type-unsoundness problem of comment #6, here is my proposed solution. It eliminates the interleaved text{""} nodes from the normalization of direct element constructors. To "distinguish" enclosed expressions, each is instead normalized to a separate function call. Roughly speaking, we split the (intended) semantics of fs:item-sequence-to-node-sequence into two fs functions, called fs:A and fs:B here for brevity. With respect to XQuery 3.7.1.3, fs:A represents step 1e (the processing of enclosed expressions, including node-copying and all that that entails), and fs:B represents steps 2 through 4 (the processing of the constructor's whole content sequence). (I'd suggest that fs:A inherit the name fs:item-sequence-to-node-sequence, and fs:B get the name fs:element-content-sequence.) The specific changes to rules would be as follows. (Of course there would be collateral changes to the prose and examples.)
4.7.1 Direct Element Constructors / Norm / rule 3
    Change fs:item-sequence-to-node-sequence to fs:B. Delete the interleaved text{""} items.
4.7.1 Direct Element Constructors / Norm / rule 8
    Change to: [[ { Expr } ]]_ElementContentUnit == fs:A(( [[ Expr ]]_Expr ))
4.7.3.1 Computed Element Constructors / Norm / rule 2+3
    Change fs:item-sequence-to-node-sequence(...) to fs:B( fs:A(...) )
7.1.5 The fs:item-sequence-to-node-sequence function
    Split it into sections for fs:A and fs:B as follows. For brevity here, I leave out the "statEnv |-" and use the following abbreviations:
        Child_Type -> (element*|text|processing-instruction*|comment)
        A(Type) -> (FS-URI,"A")(Type)
        B(Type) -> (FS-URI,"B")(Type)
The STA rules for fs:A would be:

    Type <: attribute**
    ---------------------
    A(Type) : attribute**

    Type <: (Child_Type|document|xs:anyAtomicType)*
    --------------------------------------------------------------
    A(Type) : (Child_Type|document)*

    Type <: attribute**, (Child_Type|document|xs:anyAtomicType)*
    --------------------------------------------------------------
    A(Type) : attribute**, (Child_Type|document)*

The STA rule for fs:B would be:

    Type <: attribute**, (Child_Type|document)*
    ---------------------------------------------
    B(Type) : attribute**, Child_Type*
Concerning comment #7, I was wondering why Saxon doesn't have this problem. The answer is that it only uses the "empty text node" trick for element content, not for attribute content. For attribute content, xx="{a}{b}c{d}" is translated into attribute {'xx'} {concat(string-join(a, ' '), string-join(b, ' '), 'c', string-join(d, ' '))} which I think is perfectly sound.
(In reply to comment #9)
> For attribute content, xx="{a}{b}c{d}" is translated into
>
> attribute {'xx'} {concat(string-join(a, ' '), string-join(b, ' '), 'c',
> string-join(d, ' '))}
Consider the case where a (or b or d) yields a sequence of integers. The latter translation would raise a type error re string-join's first argument, whereas the direct constructor would cast the integers into strings.
To address the problem with attribute constructors shown in comment #7, I propose a similar fix to that outlined for element constructors in comment #8. Here, we split fs:item-sequence-to-untypedAtomic into two functions, fs:C and fs:D. With respect to XQuery 3.7.1.1, fs:C represents step 2, and fs:D represents step 3. (For real names, fs:C could be fs:item-sequence-to-string-attr, and fs:D could be fs:attribute-content-sequence, or maybe just fn:string-join(_,'').)
4.7.1.1 Attributes / Norm / rule 4
    Change fs:item-sequence-to-untypedAtomic to fs:D. Delete the interleaved text{""} items.
4.7.1.1 Attributes / Norm / rule 6
    Change to: [[ { Expr } ]]_AttributeContentUnit == fs:C(( [[ Expr ]]_Expr ))
4.7.3.2 Computed Attribute Constructors / Norm / rule 2+3
    Change fs:item-sequence-to-untypedAtomic(...) to fs:D( fs:C(...) )
7.1.7 The fs:item-sequence-to-untypedAtomic function
    Split into:
        fs:C( $items as item()* ) as xs:string
        fs:D( $strings as xs:string* ) as xs:untypedAtomic
    Both are typed as declared, no special rules.
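A minimal Python sketch of the proposed split may help to see why it yields "42" for a="{4}{2}". The names fs_c and fs_d are stand-ins for fs:C and fs:D, the inputs are assumed to be already atomized, str() stands in for the cast to string, and the untypedAtomic result is modelled as a plain Python string.

```python
def fs_c(items):
    """Stand-in for fs:C: handle one enclosed expression by casting each
    (already atomized) value to a string and joining with single spaces."""
    return " ".join(str(item) for item in items)


def fs_d(strings):
    """Stand-in for fs:D: concatenate the per-unit strings with no separator."""
    return "".join(strings)


# a="{4}{2}": each enclosed expression becomes its own fs:C call,
# and no empty text nodes are interleaved
value = fs_d([fs_c([4]), fs_c([2])])

# xx="{a}{b}c{d}" with a = (1, 2), b = (), d = (3); the literal "c" is
# passed through as a string unit
mixed = fs_d([fs_c([1, 2]), fs_c([]), "c", fs_c([3])])

print(value, mixed)
```

Because the space-joining happens only inside each enclosed expression, adjacent units are concatenated directly, and integer items are cast rather than rejected.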
At meeting 360, the WG endorsed the changes proposed in comments #8 and #11. This will eventually be reflected by an erratum on the FS spec.
Document construction is normalized with the rule:
    [document { Expr }]Expr == document { fs:item-sequence-to-node-sequence-doc(( [Expr]Expr)) }
In FS 7.1.6, the fs:item-sequence-to-node-sequence-doc function is described as "applying the normative rules numbered 1, 2, 3 after the sentence "Processing of the document node constructor then proceeds as follows:" in Section 3.7.3.3 Document Node Constructors." However, in XQ 3.7.3.3, the preceding paragraph states that the "content expression of a document node constructor is processed in exactly the same way as an enclosed expression in the content of a direct element constructor, as described in Step 1e of 3.7.1.3 Content." This application of Step 1e isn't captured by the current fs:item-sequence-to-node-sequence-doc. The function fs:A (the 'new' fs:item-sequence-to-node-sequence) is now defined to perform this step. Therefore I suggest that document constructors should normalize to:
    [document { Expr }]Expr == document { fs:E(fs:A(( [Expr]Expr))) }
where fs:E might be called fs:document-content-sequence for consistency with Michael's changes. I realise that fs:item-sequence-to-node-sequence-doc could just be redefined to apply Step 1e, but since there is already a function defined to apply that step, it does seem sensible and consistent to describe the normalization with two functions.
(In reply to comment #13)
> This application of Step 1e isn't captured by the current
> fs:item-sequence-to-node-sequence-doc.
Yes, this was pointed out in Bug 3655 comment #1, which is awaiting processing.
> Therefore I suggest that document constructors should normalize to:
>
> [document { Expr }]Expr
> ==
> document { fs:E(fs:A(( [Expr]Expr))) }
>
> where fs:E might be called fs:document-content-sequence for consistency with
> Michael's changes.
>
> I realise that fs:item-sequence-to-node-sequence-doc could just be redefined
> to apply Step 1e,
(and that was the original plan for resolving Bug 3655)
> but since there is already a function defined to apply that step, it does
> seem sensible and consistent to describe the normalization with two functions.
I agree.
In order to properly support the uses of fs:item-sequence-to-node-sequence in sections 4.4.1 Insert and 4.4.3 Replace of the XQuery Update CR (see Bug 5666 comment #0), I propose the following tweak to the above fixes. Recall that the semantics of fs:B are: 1) Replace each document node by its children. 2) Merge adjacent text nodes, delete empty text nodes. 3) Raise an error if an attribute node follows a non-attribute node. The starting point for the tweak is to move step 1 from fs:B up to fs:A. (This is valid because fs:B encounters a document node if and only if fs:A emits one, and because the replacement is context-independent.) So fs:A's output type no longer includes 'document', which means that fs:A's static typing now achieves all the type-transforms that fs:B(fs:A(...)) used to. Which means that fs:B now has no real need to appear in the normalized query, and can instead dissolve into the dynamic semantics of computed element constructors. Note that those semantics already enforce step 3, so the only thing left is for them to handle step 2. So, here is the tweaked version of the fix in comment #8. (I assume that fs:A inherits the name fs:item-sequence-to-node-sequence.) 4.7.1 Direct Element Constructors / Norm / rule 3 Delete fs:item-sequence-to-node-sequence and all the parens. Delete the interleaved text{""} items. 4.7.1 Direct Element Constructors / Norm / rule 8 Change to: [[ { Expr } ]]_ElementContentUnit == fs:item-sequence-to-node-sequence(( [[ Expr ]]_Expr )) 4.7.3.1 Computed Element Constructors / Dyn Ev / rule 1+2 After statEnvn; dynEnv |- Expr0 => Value0 insert Value0 with text nodes prepared is Value1 and change subsequent occurrences of Value0 to Value1. The latter is an informally defined auxiliary judgment that implements: Merge adjacent text nodes; delete empty text nodes. 
7.1.5 The fs:item-sequence-to-node-sequence function
    Change its STA as follows. For brevity here, I leave out the "statEnv |-" and use the following abbreviations:
        Child_Type -> (element*|text|processing-instruction*|comment)
        A(Type) -> (FS-URI,"item-sequence-to-node-sequence")(Type)
    The STA rules would be:

        Type <: attribute**
        ---------------------
        A(Type) : attribute**

        Type <: (Child_Type|document|xs:anyAtomicType)*
        --------------------------------------------------------------
        A(Type) : Child_Type*

        Type <: attribute**, (Child_Type|document|xs:anyAtomicType)*
        --------------------------------------------------------------
        A(Type) : attribute**, Child_Type*

And here's the tweaked version of the fix for document constructors in comment #13:
4.7.3.3 Document Node Constructors / Norm / rule 1
    Change fs:item-sequence-to-node-sequence-doc to fs:item-sequence-to-node-sequence
4.7.3.3 Document Node Constructors / Dyn Ev / rule 1+2
    Insert a premise of the form
        ValueX with text nodes prepared is ValueY
    as appropriate.
7.1.6 The fs:item-sequence-to-node-sequence-doc
    Drop the section.
========================================================================
With the above changes, the references to fs:item-sequence-to-node-sequence in XQuery Update sections 4.4.1 Insert and 4.4.3 Replace would become correct, both statically and dynamically.
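The redistribution of steps in this tweak (document replacement moving into fs:item-sequence-to-node-sequence, and text-node merging becoming the "with text nodes prepared" judgment) can be sketched in Python. The tuple representation of nodes and both function names are assumptions for illustration; node copying and the conversion of atomic values to text nodes, which the real function also performs, are deliberately omitted.

```python
def replace_documents(items):
    """Step 1, now part of the function: replace each document node by its
    children. A document is modelled as ("document", [children]); every
    other node as a (kind, value) pair."""
    out = []
    for kind, value in items:
        if kind == "document":
            out.extend(value)
        else:
            out.append((kind, value))
    return out


def with_text_nodes_prepared(nodes):
    """Step 2, now part of dynamic evaluation: merge adjacent text nodes,
    then delete empty text nodes."""
    merged = []
    for kind, value in nodes:
        if kind == "text" and merged and merged[-1][0] == "text":
            merged[-1] = ("text", merged[-1][1] + value)
        else:
            merged.append((kind, value))
    return [(k, v) for k, v in merged if not (k == "text" and v == "")]


content = [
    ("attribute", "v1"),
    ("document", [("text", "ab"), ("element", "x")]),
    ("text", ""),
    ("text", "cd"),
]
prepared = with_text_nodes_prepared(replace_documents(content))
print(prepared)
```

Step 3 (rejecting an attribute node that follows a non-attribute node) is not modelled here, since the comment notes that the constructor's dynamic semantics already enforce it.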
The proposal in the preceding comment was approved by the WGs at meeting #369 on 2008-06-03.
The element aspect of this issue has been entered as FS erratum E029, and the fix from comment #15 has been committed to the source files for the next edition of the FS document.
The attribute aspect of this issue has been entered as FS erratum E031, and the fix from comment #11 has been committed to the source files for the next edition of the FS document. Consequently, I'm marking this issue CLOSED.