<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE bugzilla SYSTEM "https://www.w3.org/Bugs/Public/page.cgi?id=bugzilla.dtd">

<bugzilla version="5.0.4"
          urlbase="https://www.w3.org/Bugs/Public/"
          
          maintainer="sysbot+bugzilla@w3.org"
>

    <bug>
          <bug_id>20995</bug_id>
          
          <creation_ts>2013-02-14 12:45:39 +0000</creation_ts>
          <short_desc>[XPROC10] document-uri</short_desc>
          <delta_ts>2014-03-12 14:15:11 +0000</delta_ts>
          <reporter_accessible>1</reporter_accessible>
          <cclist_accessible>1</cclist_accessible>
          <classification_id>1</classification_id>
          <classification>Unclassified</classification>
          <product>XML Processing Model</product>
          <component>Pipeline language</component>
          <version>unspecified</version>
          <rep_platform>PC</rep_platform>
          <op_sys>Windows NT</op_sys>
          <bug_status>ASSIGNED</bug_status>
          <resolution></resolution>
          
          
          <bug_file_loc></bug_file_loc>
          <status_whiteboard></status_whiteboard>
          <keywords></keywords>
          <priority>P2</priority>
          <bug_severity>normal</bug_severity>
          <target_milestone>---</target_milestone>
          
          
          <everconfirmed>1</everconfirmed>
          <reporter name="Tim Mills">tim</reporter>
          <assigned_to name="Norman Walsh">ndw</assigned_to>
          
          
          

      

      

      

          <comment_sort_order>oldest_to_newest</comment_sort_order>  
          <long_desc isprivate="0" >
    <commentid>83141</commentid>
    <comment_count>0</comment_count>
    <who name="Tim Mills">tim</who>
    <bug_when>2013-02-14 12:45:39 +0000</bug_when>
    <thetext>The XProc 1.0 specification fails to mention the value of the document-uri property for the results of any step, or for pipeline elements such as p:document.

I presume that for the pipeline element p:document, it should be the URI of the accessed document.

However, for other steps, and in particular the p:load step, I presume that the document-uri should be absent.  Otherwise, the XPath rule that &quot;given a document node $N, the result of fn:doc(fn:document-uri($N)) is $N will always be True&quot; could be violated, e.g. by two invocations of p:load that specify the same URI but different values for dtd-validate.
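
As a rough illustration of that scenario, here is a minimal sketch in Python rather than XProc. The DocumentNode class and the load and doc helpers are hypothetical stand-ins for a document node, p:load and fn:doc, and the sketch assumes that p:load sets document-uri to the requested URI, which the specification does not say.

```python
class DocumentNode:
    # Hypothetical model of a document node and the options used to load it.
    def __init__(self, document_uri, dtd_validate):
        self.document_uri = document_uri
        self.dtd_validate = dtd_validate


def load(href, dtd_validate):
    # Stand-in for p:load; ASSUMES document-uri is set to the requested href.
    return DocumentNode(href, dtd_validate)


def doc(uri, registry):
    # Stand-in for fn:doc: at most one stable node per URI.
    return registry[uri]


# Two loads of the same URI with different dtd-validate values.
a = load("http://example.com/in.xml", dtd_validate=False)
b = load("http://example.com/in.xml", dtd_validate=True)

# fn:doc can only map the URI to one of the two nodes.
registry = {a.document_uri: a}

# Check "fn:doc(fn:document-uri($N)) is $N" for each node.
holds_for_a = doc(a.document_uri, registry) is a
holds_for_b = doc(b.document_uri, registry) is b
print(holds_for_a, holds_for_b)
```

Under these assumptions the rule holds for the first node but not for the second, since fn:doc can return only one of the two nodes for that URI.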

It might be worth noting that the &apos;p:identity&apos; operation changes the document-uri and base-uri properties.  It&apos;s also not clear whether for a document $d and its &apos;identical&apos; copy $i, ($d is $i) must return false.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>84338</commentid>
    <comment_count>1</comment_count>
    <who name="Norman Walsh">ndw</who>
    <bug_when>2013-03-13 15:06:01 +0000</bug_when>
    <thetext>Hi Tim,

We&apos;re starting to take up the bugs :-)

Can you clarify what you mean in the last paragraph about p:identity changing the URIs? The WG doesn&apos;t believe that p:identity changes anything about the document.

And we also don&apos;t make the XQuery/XSLT guarantee about consistency of documents, because pipelines explicitly change them.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>84400</commentid>
    <comment_count>2</comment_count>
    <who name="Tim Mills">tim</who>
    <bug_when>2013-03-14 10:32:29 +0000</bug_when>
    <thetext>
&gt; We&apos;re starting to take up the bugs :-)

Great - thanks.  You may be relieved to know I&apos;m near the end of implementation, and so there shouldn&apos;t be too many more queries to come.

&gt; Can you clarify what you mean in the last paragraph about p:identity
&gt; changing the URIs. The WG doesn&apos;t believe that p:identity changes anything
&gt; about the document.

That came from a misunderstanding of the comment in the test &quot;Test base-uri #002&quot;.

      &lt;!-- This p:identity step makes sure that we grab the root element --&gt;
      &lt;!-- where the xml:base exists. Otherwise, we get the base uri --&gt;
      &lt;!-- of the input document itself, and that varies by test env. --&gt;
      &lt;p:identity&gt;
        &lt;p:input port=&quot;source&quot; select=&quot;/doc&quot;/&gt;
      &lt;/p:identity&gt;

Of course, it&apos;s not the p:identity which is having an effect on the base URI, but rather the implicit creation of new document nodes resulting from the select=&quot;/doc&quot;.

&gt; And we also don&apos;t make the XQuery/XSLT guarantee about consistency of
&gt; documents, because pipelines explicitly change them.

Since the only changes visible from execution of an XProc pipeline result from a p:store or a result document from p:xslt (which set a potentially new document URI), it makes sense for new (intermediate) documents created by XProc steps (explicitly or implicitly through use of select) to have absent document URIs.

Since XProc uses XPath, it needs to guarantee that

&quot;... given a document node $N, the result of fn:doc(fn:document-uri($N)) is $N will always be True, unless fn:document-uri($N) is an empty sequence.&quot;

although XProc is at liberty to relax the guarantee of stability for document access.

The example in 

http://www.w3.org/TR/xproc/#parallelism

shows that constructing the pipeline carefully so that the consequences of side-effects are evident to the processor can avoid much of the unpleasantness of side-effects on document stability.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>98083</commentid>
    <comment_count>3</comment_count>
    <who name="Norman Walsh">ndw</who>
    <bug_when>2014-01-07 15:38:39 +0000</bug_when>
    <thetext>Hi Tim,

Apologies for not driving the process of resolving 1.0 errata with more vigor. I&apos;ll try to do better in 2014.

In discussing this issue (http://www.w3.org/XML/XProc/2013/03/20-minutes#action02), we considered the possibility that we could look at the evaluation context for an XPath expression as being scoped to an individual step. That would seem to satisfy the XPath constraint, at least if some care is taken to cache documents for the duration of evaluating the expressions in a step invocation.

I wonder how that sits with you.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>98253</commentid>
    <comment_count>4</comment_count>
    <who name="Tim Mills">tim</who>
    <bug_when>2014-01-10 10:53:23 +0000</bug_when>
    <thetext>(In reply to Norman Walsh from comment #3)
&gt; Hi Tim,
&gt; 
&gt; Apologies for not driving the process of resolving 1.0 errata with more
&gt; vigor. I&apos;ll try to do better in 2014.

Thanks.  I know you&apos;re busy.

&gt; ...
&gt; I wonder how that sits with you.

To ensure that the XPath requirement is met, I think:

1.  fn:doc needs to be stable within the entire execution of a pipeline (not just a step), AND

2.  For a document sourced by means other than calls to fn:doc (or fn:collection), either
  a) the document node should have an empty document-uri, OR
  b) if the document node has a non-empty document-uri &apos;SOME-URI&apos;, then it MUST be stable, i.e. as if it had been accessed via a call to fn:doc(&apos;SOME-URI&apos;).
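
A minimal sketch of how condition 2 might be checked on the inputs to a step, in Python rather than XProc. The function name and the representation of inputs as (document-uri, node) pairs are assumptions made for illustration only.

```python
def violates_xpath_rule(inputs):
    # inputs: list of (document_uri or None, node) pairs arriving at a step.
    seen = {}
    for uri, node in inputs:
        if uri is None:
            continue  # case (a): absent document-uri, nothing to check
        if uri in seen and seen[uri] is not node:
            return True  # same non-empty URI, but two different nodes
        seen[uri] = node
    return False


# Placeholder objects standing in for two distinct document nodes.
doc_a = object()
doc_b = object()

same_uri_different_nodes = violates_xpath_rule([("SOME-URI", doc_a), ("SOME-URI", doc_b)])
absent_uris = violates_xpath_rule([(None, doc_a), (None, doc_b)])
same_uri_same_node = violates_xpath_rule([("SOME-URI", doc_a), ("SOME-URI", doc_a)])
print(same_uri_different_nodes, absent_uris, same_uri_same_node)
```

Only the first call reports a violation: absent document URIs (case a) and a single stable node per URI (case b) both pass.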

Otherwise, it is possible for two documents A and B to arrive as inputs to a step and have the same non-empty document-uri but be different, violating the XPath requirement.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>100866</commentid>
    <comment_count>5</comment_count>
    <who name="Norman Walsh">ndw</who>
    <bug_when>2014-02-19 10:15:07 +0000</bug_when>
    <thetext>Minutes from 19 Feb:

We&apos;re going to document this as an erratum in V1 but not try to fix it there.

In V2, we&apos;ll make doc() stable.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>100868</commentid>
    <comment_count>6</comment_count>
    <who name="Tim Mills">tim</who>
    <bug_when>2014-02-19 10:23:48 +0000</bug_when>
    <thetext>Thanks.

And will point (2)

2.  For a document sourced by means other than calls to fn:doc (or fn:collection), either
  a) the document node should have an empty document-uri, OR
  b) if the document node has a non-empty document-uri &apos;SOME-URI&apos;, then it MUST be stable, i.e. as if it had been accessed via a call to fn:doc(&apos;SOME-URI&apos;).

also hold in V2?</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>102239</commentid>
    <comment_count>7</comment_count>
    <who name="Norman Walsh">ndw</who>
    <bug_when>2014-03-12 14:15:11 +0000</bug_when>
    <thetext>Yes, I think it&apos;s incumbent on us to get the XPath semantics right in V2.</thetext>
  </long_desc>
      
      

    </bug>

</bugzilla>