Bug 29608: Fragment identifiers: fn:doc vs. fn:unparsed-text

Created: 2016-05-04 05:15:13 +0000
Last modified: 2016-07-21 15:03:00 +0000
Classification: Unclassified
Product: XPath / XQuery / XSLT
Component: Functions and Operators 3.1
Version: Candidate Recommendation
Platform: PC, Windows NT
Status: CLOSED FIXED
Priority: P2
Severity: normal
Reporter: christian.gruen
Assignee: mike
CC: abel.braaksma
QA contact: public-qt-comments

Comment 0 (126327), christian.gruen, 2016-05-04 05:15:13 +0000

I noticed that fragment identifiers are only allowed for fn:doc, but not for fn:unparsed-text. I assume there is a specific reason for that, but I could not find any hint in the XQFO specification. Would it be possible to add an explanatory phrase to one of the function definitions?

Comment 1 (126328), mike, 2016-05-04 10:49:51 +0000

The semantics of a fragment identifier depend on the media type. There is a well-defined interpretation of fragment identifiers for application/xml, but I'm not aware of any for text/plain. Am I mistaken?

Comment 2 (126330), christian.gruen, 2016-05-04 11:03:52 +0000

> Am I mistaken?

I don’t think so. It was mostly my knowledge gap (and my vague assumption that others could have similar gaps) that led to this question. With the well-defined interpretation, do you refer to the XPointer working draft?

Comment 3 (126376), abel.braaksma, 2016-05-08 19:46:09 +0000

The fragment identifier is defined with the MIME type of a document and may depend on the serving application.

W3C says this (https://www.w3.org/DesignIssues/Fragment.html):
"The significance of the fragment identifier is a function of the MIME type of the object.[...]The fragment ID spec for a new MIME type should be part of the MIME type registration process."

And RFC-3986 says:
"The fragment's format and resolution is [..] dependent on the media type [RFC2046] of a potentially retrieved representation. [..] Fragment identifier semantics are independent of the URI scheme and thus cannot be redefined by scheme specifications."

Here are some examples of such MIME types:
- RFC-5147 defines fragment ids for text/plain through "char" and "line"
- RFC-7111 defines it for text/csv for "col" and "row"
- RFC-3778 defines it for application/pdf (page no, section, ref etc)
- W3C has a Recommendation for Media Fragments: https://www.w3.org/TR/media-frags/
- Some languages support package download (like Python) and add the MD5 hash as a fragment identifier to the URI, which then either retrieves the whole document or an error (if the MD5 does not match). This could well be used with any resource.
- This same MD5 check, or a length-check is also part of RFC-5147.

So I think Christian has a good point and perhaps we should say something about it, and at the very least allow it with unparsed-text etc as well (in fact, I think we should allow it with any external resource).

Comment 4 (126407), mike, 2016-05-10 15:43:43 +0000

Noted in discussion:

(a) this is a new feature to introduce at a late stage and because of the interaction with the environment there's a lot of scope for quibbling about the exact spec; the chance of getting it right first time is small.

(b) doc() currently allows a fragment id and defines no semantics for it. Almost any attempt to define semantics for it would cause a compatibility problem for some implementations (or for their users...)

(c) unparsed-text() currently disallows a fragment id. If we were to allow it, we would probably have to leave the semantics implementation-defined if we want similar behaviour to doc(). It's not clear that's desirable.

Comment 5 (126408), mike, 2016-05-10 15:56:03 +0000

We resolved to make no change to what's allowed/disallowed but to add editorial notes explaining the effect of fragment identifiers in the doc() function.
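
Editorial illustration (not part of the original thread): the "line" fragment scheme that comment 3 attributes to RFC-5147 could be applied to already-retrieved text roughly as follows. This is only a minimal sketch under stated assumptions: the function name `select_lines` is hypothetical, only `line=start,end` ranges are handled, and the length/MD5 integrity checks mentioned above are skipped rather than verified. It is not how fn:unparsed-text behaves, since that function disallows fragment identifiers.

```python
import re
from urllib.parse import urldefrag


def select_lines(text: str, uri: str) -> str:
    """Return the part of `text` selected by a `line=start,end` fragment in `uri`.

    Hypothetical helper: positions are treated as zero-based gaps between
    lines, so `line=0,1` selects the first line. Anything else (no fragment,
    a `char=` fragment, a bare position) returns the text unchanged.
    """
    _, fragment = urldefrag(uri)
    # Assumption: integrity checks (";length=..." or ";md5=...") are ignored here.
    scheme = fragment.split(";", 1)[0]
    match = re.fullmatch(r"line=(\d*),(\d*)", scheme)
    if not match:
        return text
    lines = text.splitlines(keepends=True)
    # A missing start means "from the beginning"; a missing end means "to the end".
    start = int(match.group(1)) if match.group(1) else 0
    end = int(match.group(2)) if match.group(2) else len(lines)
    return "".join(lines[start:end])
```

For example, `select_lines(text, "http://example.org/notes.txt#line=1,2")` would return only the second line of `text`; the URI is a placeholder.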
Comment 6 (126694), mike, 2016-06-07 15:02:14 +0000

fn:doc has a list of things where the behaviour of the function is implementation-defined, and I have added to this list:

<p diff="add" at="E">The effect of a fragment identifier in the supplied URI is implementation-defined. One possible interpretation is to treat the fragment identifier as an ID attribute value, and to return a document node having the element with the selected ID value as its only child.</p>
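
Editorial illustration (not part of the original thread): the "one possible interpretation" described in the added paragraph could look roughly like the sketch below. It is written in Python rather than XQuery, and it rests on several assumptions: the function name `select_by_fragment` is hypothetical, the document text is passed in instead of being fetched, and only `xml:id` attributes count as IDs because no DTD or schema is consulted. The effect remains implementation-defined, so a real processor may behave differently.

```python
import copy
import xml.etree.ElementTree as ET
from urllib.parse import urldefrag

# ElementTree exposes the predeclared xml: prefix in this expanded form.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def select_by_fragment(xml_text: str, uri: str) -> ET.ElementTree:
    """Return a document whose only child is the element named by the fragment.

    Hypothetical helper: the fragment of `uri` is compared against xml:id
    values. Without a fragment, the whole document is returned.
    """
    _, fragment = urldefrag(uri)
    root = ET.fromstring(xml_text)
    if not fragment:
        return ET.ElementTree(root)
    for element in root.iter():
        if element.get(XML_ID) == fragment:
            # Copy the subtree so the new document does not share nodes
            # with the original, and drop text that followed the element.
            selected = copy.deepcopy(element)
            selected.tail = None
            return ET.ElementTree(selected)
    # Assumption: an unmatched fragment is reported as an error.
    raise LookupError(f"no element with xml:id {fragment!r}")
```

Calling `select_by_fragment(xml_text, "http://example.org/doc.xml#s1")` would then yield a tree rooted at the element whose `xml:id` is `s1`; the URI is a placeholder.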