Bugzilla – Bug 16311
[Ser30] newlines are an integral part of text mode serialization too
Last modified: 2013-01-19 03:22:34 UTC
In XML serialization, we can produce
For TEXT mode serialization, it should be just as easy to get
Mainly what I'm trying to say is in the world of TEXT mode, please don't just
offer an endless stream of items separated by SPACES. Instead remember that
TEXT mode is two dimensional: newlines and spaces!
Don't force the user to hardcode raw newlines in just to get them back out.
I've looked up the XQuery Serialization Spec.: Step 3 in Section 2, "Sequence Normalization", contains the sub-sentence "separated by a single space", which would probably have to be extended to take an additional serialization parameter into account. An XQuery 3.0 expression that uses the parameter could e.g. look as follows:
declare option output:separator "&#10;";
(1 to 10)
Opinions are welcome.
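To make the proposal concrete, here is a rough Python model of the "separated by a single space" step with the separator turned into a parameter. The function name normalize_atomic_sequence and its separator argument are hypothetical, chosen only for illustration; this is a sketch of the idea, not the spec's algorithm or any implementation's API.

```python
def normalize_atomic_sequence(values, separator=" "):
    """Toy model: cast each atomic value to a string and join the results.

    The default separator mirrors the single space used by sequence
    normalization; passing another string models the proposed parameter.
    """
    return separator.join(str(v) for v in values)


values = range(1, 11)  # stands in for the XQuery expression (1 to 10)

default_output = normalize_atomic_sequence(values)
newline_output = normalize_atomic_sequence(values, separator="\n")

print(default_output)   # 1 2 3 4 5 6 7 8 9 10
print(newline_output)   # the same ten numbers, one per line
```

Only the separator changes between the two calls; the query itself stays untouched.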
All I know is I see a genuine raw ASCII character number 10
reference staring me in the face above.
Please add some level of indirection so it looks more elegant.
I recall the HTTP spec mentions what 'newlines' should be.
Use that if there is no XML, HTML etc. spec for text newlines.
Isn't there some way to detect it automatically for the operating
system, so the user need only change it if detected wrong?
OK, so make CR LF the default, and OK then I will use your above
reference if I just want UNIX LFs. But at least make a default.
Anyway, just make sure that
output can become
with just the change of an XQuery header declaration, and _no_ other rewriting of the
With my above example, we see Jidanni (that's me) doesn't want _every_
space to become a newline...
I have some sympathy with this suggestion. I would be inclined to handle it by overloading the interpretation of indent="yes" rather than adding a new serialization parameter: simply specify that if indent="yes" is specified then (regardless of which serialization method is used) the single space character used as a separator during sequence normalization MAY be replaced by some other whitespace sequence.
Good point; if possible, I'd like to stick with the existing parameters as well. Would it be possible to extend the choice of "indent" parameters with additional values (e.g. "space", "tab", "newline") without introducing legacy issues?
If not, how can a user select a different whitespace sequence without being restricted to a specific implementation?
Wait, I can just use CSV (Comma Separated Values) output mode, setting quotes and commas to nil first!
...Alas, CSV is just an input mode, not an output mode, in my favorite XQuery implementation :-(
I like Michael Kay's suggestion in comment 3 of overloading the meaning of indent="yes" to satisfy this requirement. However, I think it might be better to make it stronger than a "MAY" - for the text output method, the formatting seems much more important than for XML.
One concern I have lies with the interaction with mixed content. I think we would want to restrict the insertion of new line characters (or whatever whitespace was desired) in the same way that the serialization draft does for the XML output method.
e.g., The following
<p>This is my <b>first</b> paragraph.</p><p>This is my <i>second</i> paragraph.</p>
should probably be serialized as
This is my first paragraph.
This is my second paragraph.
and not as
This is my
This is my
Jidanni, I still feel some confusion about the requirement you've described. Are you interested only in new lines between items in the sequence that is to be serialized, or new lines in other places as well? In your examples, you have a sequence of two elements. What if you were serializing a document node that contained two elements, as in the result of this query?
Do you still want to see this result serialized?
I am sort of getting hazy, but why would I want my results glued together :-)
Anyway why don't you add an additional controlling parameter, so one can pick
1. Results glued together
2. Results separated by spaces
3. Results separated by newlines
And be sure to allow the user to switch back and forth anywhere in the program, don't lock him into only one mode for the whole file.
Jidanni, sorry for the late update on this issue. At the joint teleconference of the XSLT and XQuery working groups of 25 September 2012, the working groups decided to resolve your request through the addition of a new "item-separator" serialization parameter.
The item-separator is a string, and if present, each item in the serialized result is separated by its value. The serialization parameter affects all output methods, not just the text output method.
So, if the sequence to be serialized was <X>1</X>,<Y>2</Y>, and the value of the item-separator was the LINE FEED character, the serialized result for the XML output method would be
and for the TEXT output method would be
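The effect of item-separator on the two output methods can be approximated in Python. This is a toy model under stated assumptions: serialize_items is a hypothetical helper, Python's xml.etree.ElementTree stands in for a real serializer, and details such as the XML declaration are ignored.

```python
import xml.etree.ElementTree as ET


def serialize_items(items, method, item_separator):
    """Toy model: serialize each top-level item, then join with the separator."""
    parts = []
    for item in items:
        if method == "xml":
            # Markup of the element, without an XML declaration.
            parts.append(ET.tostring(item, encoding="unicode"))
        else:
            # "text" method: only the string value of the element is kept.
            parts.append("".join(item.itertext()))
    return item_separator.join(parts)


# The sequence <X>1</X>,<Y>2</Y> with LINE FEED as the item separator.
items = [ET.fromstring("<X>1</X>"), ET.fromstring("<Y>2</Y>")]

xml_result = serialize_items(items, "xml", "\n")
text_result = serialize_items(items, "text", "\n")

print(xml_result)   # <X>1</X> and <Y>2</Y> on separate lines
print(text_result)  # 1 and 2 on separate lines
```

In both cases the separator appears exactly once, between the two items.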
Note that the value of the parameter is only inserted between items in the original sequence that is to be serialized. So if the value of the item-separator was LINE FEED, and the sequence to be serialized under the TEXT output method was
<p>My <em>first</em> paragraph.</p><p>My <em>second</em> paragraph.</p>
then the serialized result would be
My first paragraph.
My second paragraph.
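A short Python sketch of why the mixed-content case comes out this way: the separator is placed between the top-level items only, never between the text nodes inside an item. Python's xml.etree.ElementTree is used here purely as a stand-in, and the variable names are illustrative; the second computation shows the unwanted result of separating every text node.

```python
import xml.etree.ElementTree as ET

paragraphs = [
    ET.fromstring("<p>My <em>first</em> paragraph.</p>"),
    ET.fromstring("<p>My <em>second</em> paragraph.</p>"),
]

# Separator between top-level items only: take the string value of each
# <p>, then join the two values with LINE FEED.
per_item = "\n".join("".join(p.itertext()) for p in paragraphs)

# For contrast, a separator between every text node would break each
# paragraph apart at the <em> boundaries.
per_text_node = "\n".join(t for p in paragraphs for t in p.itertext())

print(per_item)
print(per_text_node.count("\n"))  # 5 line breaks instead of 1
```

The nested em elements contribute text to each paragraph's string value but never trigger a separator of their own.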
You can see the detailed description in the new working draft of Serialization 3.0.
 https://lists.w3.org/Archives/Member/w3c-xsl-query/2012Sep/0094.html (Member-only link)
Thanks. I hope that is what I wanted but well time slips so I am foggy now :-)