Re: Line separator and Paragraph separator in HTML 5

On Mon, 18 Jan 2010, Kent Karlsson wrote:

> So I don't think one should blindly reuse this bidi category for other
> purposes. For HTML5's purposes, I think TAB, VT, LF, CR, NEL, and PS
> should also be considered to be "white space"; i.e. a slightly more
> general sense than the bidi category White_Space/WS. Further, in addition
> to LF and CR, also VT, FF, NEL, LS, and PS should be considered line
> break characters.
>
> I don't see much logic in having both "[HTML5]space" and "White_Space"
> in HTML5. A single set (as described above) would suffice it seems to me...
> (out of which a subset are also line break characters, as above).


"[HTML5]space" has a very clear meaning: these are characters used for 
HTML source formatting; any sequence of "[HTML5]space" characters is
equivalent to a single space. Surely "[HTML5]space" should include TAB,
VT, LF, FF, CR, Space, NEL, LS, and PS.
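
To make the collapsing rule concrete, here is a minimal Python sketch. It
assumes the set proposed above (TAB, VT, LF, FF, CR, Space, NEL, LS, PS),
which is a suggestion in this thread and not necessarily what the HTML5
draft defines. The names PROPOSED_SPACE_RUN and collapse_space are made up
for illustration.

```python
import re

# Assumed set, taken from the proposal in this message (not from the spec):
# TAB, VT, LF, FF, CR, Space, NEL (U+0085), LS (U+2028), PS (U+2029).
PROPOSED_SPACE_RUN = re.compile("[\t\v\n\f\r \u0085\u2028\u2029]+")


def collapse_space(text: str) -> str:
    """Replace every run of proposed "[HTML5]space" characters with one space."""
    return PROPOSED_SPACE_RUN.sub(" ", text)


if __name__ == "__main__":
    source = "line one\u2028line two\r\n\tline three\u0085end"
    print(collapse_space(source))
```

Characters outside the set, such as NO-BREAK SPACE (U+00A0), pass through
unchanged, which is the point of keeping this set separate from the
broader White_Space property.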

As for the "[HTML5]White_Space" category, its purpose is unclear to me.
The rendering of the characters in this category that are not already
included in "[HTML5]space" should be defined by the Unicode Standard,
not by the HTML standard.

Received on Monday, 18 January 2010 12:57:55 UTC