IncludeRB
From HTML WG Wiki
Change Proposal for ISSUE-172: Include the rb element
Leif Halvard Silli 03:12, 23 January 2012 (UTC)
Summary
This proposal requests:
- that the rb element, which is currently considered obsolete, be included, for the following reasons:
  - legacy documents: rb inclusion makes legacy mark-up (in the wild and per XHTML 1.1, where rb is mandatory) HTML5-compatible.
  - legacy tools: rb inclusion allows tools to operate with largely identical ruby modes for XHTML 1.1 simple ruby and (simple) HTML5 ruby (instead of operating two modes that differ greatly). It also allows authors to use XHTML 1.1 tools to create HTML5-compatible ruby content.
  - native wrapper: because rb belongs only in the ruby format, tools can auto-insert it together with the parent ruby element, which is often more practical than manually adding a wrapper as an afterthought. The current HTML5 alternative will never be as convenient, since:
    - the alternative wrapper (e.g. span) is not part of the format, so one has to add it manually. By contrast, when the editor oXygen inserts the ruby element in an XHTML 1.1 document, the rb element is added automatically so that the author can start typing inside it, much like many editors that, when inserting a dl or a table, also insert the required child elements;
    - when the content of a non-native wrapper is deleted (with a browser-based WYSIWYG tool), the wrapper itself will often be deleted too, in order to avoid littering the code with stray, empty elements (XStandard does this, see the heading “Empty Tags”), whereas an empty element that is part of the format itself could simply remain in the code, ready to be refilled with content.
  - CSS2 selector: the rb element provides a cross-browser CSS2 selector (rb{}) that, unlike e.g. span{}, selects only ruby base text. In order to offer a similar feature without inclusion of rb, HTML5's editor has proposed extending the old and largely unimplemented CSS3 Generated and Replaced Content Module with, quote, “a pseudo-element that can style certain spans of descendants; the flip side of ::outside”. His proposed “flip side of ::outside” has, however, zero implementations (according to CanIuse.com). That the editor also questions the need to select the ruby base text at all does not make this option any more credible. Note: For simple ruby, it is not uncommon (examples: one, two (two a)) to use ruby{display:inline-table} in combination with rb{display:table-row-group; /* or similar, from CSS tables */}. Removing the rb might make the styling less robust, but other than that, it seems to work without rb as well.
  - CSS2 backup styling: HTML5 prescribes the style rules ruby{display:ruby;} rt{display:ruby-text;}, which stem from the CSS3 ruby module, but which none of the common browsers fully supports yet (not even Webkit or IE, although they have some ruby support). Thus, if one wants the result to look like ruby in Opera and Firefox, one must hack up some backup CSS that works, and a wrapper element for the ruby base text may then come in handy, even if it is not impossible to make it work without one: demo of styling that works in Opera + Firefox.
  - metadata readiness: A wrapper around the ruby base words makes the ruby base text ready to be independently language tagged (<rb lang="*">), to become ARIA “styled” (e.g. <rb aria-hidden="true">) and to receive an HTML class name (<rb class="important">) without affecting <ruby>, <rt> or <rp>.
- that the content model of ruby is changed to make it non-conforming to let <rt> occur adjacent to <rb> more than once per ruby element, for the following reasons:
  - source order is important:
    - The HTML5 content model breaks with the content model of XHTML 1.1 by requiring that one do
      <ruby><rb>W</rb><rt>World</rt><rb>W</rb><rt>Wide</rt><rb>W</rb><rt>Web</rt></ruby>
    - By contrast, XHTML 1.1 requires that one do the following, which, as one can see, allows words and letters to be written in source order:
      <ruby><rbc><rb>W</rb><rb>W</rb><rb>W</rb></rbc><rtc><rt>World</rt><rt>Wide</rt><rt>Web</rt></rtc></ruby>
    - HTML5's break from the source order found in XHTML 1.1 creates problems for every parser/reader that needs to detect words more than visually.
      Problem examples: A user who does a find-in-page search in the browser for 'WWW' when the above code is used will not locate the 'WWW' that the user can see. For the same reason, it creates problems in screen readers, in online translation services like Google Translate, in copy-and-paste and selections, and so on. It is even hard to author, since the author cannot type the letters/words that belong together in one chunk; the author instead has to type one ruby base letter and then a ruby text letter/word etc., which is cumbersome and prone to error. It is comparable to a table model where the author has to work with cells on two rows simultaneously.
    - The content model of this change proposal requires that the above example be written like this:
      <ruby><rb>W</rb><rb>W</rb><rb>W</rb><rt>World</rt><rt>Wide</rt><rt>Web</rt></ruby>
      or this (NOTE: rtc is not made conforming as of yet, due to legacy parser problems):
      <ruby><rbc><rb>W</rb><rb>W</rb><rb>W</rb></rbc><rt>World</rt><rt>Wide</rt><rt>Web</rt></ruby>
- that the rbc element is included as a ruby base wrapper, for the following reasons:
  - The rbc is very helpful when styling a ruby element: it becomes almost impossible to style ruby cross-browser unless it is included. And yet, it is made optional, to allow authors to “type less”.
- that the HTML5 parser is updated to handle rtc, for the following reasons:
  - currently, Gecko and Trident will auto-close any element as soon as the parser sees rp or rt. This prevents the introduction of rtc.
  - without rtc, double-sided ruby is more or less impossible to achieve.
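The source-order point in the Summary can be illustrated with a small sketch. It is not part of the proposal: it uses Python's standard html.parser as a stand-in for any consumer that reads text in source order (find-in-page, copy-and-paste, translation services), and the names TextCollector and source_text are hypothetical helpers introduced only for this illustration.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Concatenate all character data in source order, ignoring the tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def source_text(markup):
    """Return the text of `markup` exactly as it is ordered in the source."""
    collector = TextCollector()
    collector.feed(markup)
    collector.close()
    return "".join(collector.chunks)


# Interleaved rb/rt pairs, as in the first example of the Summary.
interleaved = ("<ruby><rb>W</rb><rt>World</rt><rb>W</rb><rt>Wide</rt>"
               "<rb>W</rb><rt>Web</rt></ruby>")
# Grouped bases followed by grouped annotations, as this proposal requires.
grouped = ("<ruby><rb>W</rb><rb>W</rb><rb>W</rb>"
           "<rt>World</rt><rt>Wide</rt><rt>Web</rt></ruby>")

print(source_text(interleaved))           # WWorldWWideWWeb
print("WWW" in source_text(interleaved))  # False
print(source_text(grouped))               # WWWWorldWideWeb
print("WWW" in source_text(grouped))      # True
```

A real browser's find-in-page works on the rendered document rather than on raw source text, so this only approximates the behaviour described above.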
Rationale
- The inclusion of rb addresses the problem that the alternative, exclusion of rb, encourages ad-hoc solutions with regard to marking up or styling the ruby base text.
- The inclusion of rb allows thought-free, direct transition to/from e.g. XHTML 1.1 and HTML5. (Thought-free = automatic: the author, or the authoring tool, does not have to ponder which element to use.)
- A dedicated wrapper element offers thought-free simplicity with regard to adding CSS, ARIA, a language tag or semantic metadata to the ruby base word, without effects on <ruby>, <rt> or <rp>.
- The change of the content model addresses the need for words/compounds to appear as words/compounds in the source as well, in order to be compatible with non-visual parsing (screen readers, find-in-page, translation services etc.), which overall needs to work with text that is just as logical in the source as in the display.
- The inclusion of rbc allows advanced ruby to be styled more simply.
- Preparing the HTML5 parser to handle rtc allows the rtc element to be introduced in HTML6, thereby allowing double-sided ruby.
Details
- A set of edit instructions:
  - Inside “Text-level semantics”: Add a new section about the rb element.
  - Inside “Text-level semantics”: Add a new section about the rbc element.
  - Inside “Rendering”: Add a new note about the default CSS styling of rb (rb{display:ruby-base}).
  - Inside “Rendering”: Add a new note about the default CSS styling of rbc (probably rbc{display:ruby-base-container}).
  - Inside “Obsolete features”: delete the rb element.
- About the HTML5 parser:
  - With the exception of the ruby element itself and the rtc element, let the parser auto-close the current element (be it rb, rt, span or whatever) when the parser sees an rp or an rt element. This is almost what Gecko and Webkit currently do, with the exception that they also auto-close rtc.
- As authoring requirements:
  - Say that rb SHOULD be manually closed by the author, in order to accommodate legacy UAs (that do not auto-close it).
  - Say that the rb element is optional but RECOMMENDED (for styling, ARIA, language-tagging and semantic-web purposes (RDFa and Microdata)).
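The parser rule in the Details can be pictured with a toy model. This is a minimal sketch under stated assumptions, not the HTML5 tree construction algorithm: the token format ("rb" for a start tag, "/rb" for an end tag), the function name nesting and the reading of "auto-close the current element" as "close open elements until ruby or rtc is the current element" are all assumptions made for illustration.

```python
KEEP_OPEN = {"ruby", "rtc"}  # per the proposal, never auto-closed by rt/rp


def nesting(tokens):
    """Return (depth, element) pairs for a list of simplified tag tokens.

    Hypothetical token format: "rb" is a start tag, "/rb" an end tag.
    """
    stack, result = [], []
    for token in tokens:
        if token.startswith("/"):
            name = token[1:]
            if name in stack:
                # Close everything up to and including the named element.
                while stack.pop() != name:
                    pass
            continue
        if token in ("rt", "rp") and "ruby" in stack:
            # Auto-close open elements until ruby or rtc is the current element.
            while stack[-1] not in KEEP_OPEN:
                stack.pop()
        result.append((len(stack), token))
        stack.append(token)
    return result


# An rb that the author did not close is closed by the following rt.
print(nesting(["ruby", "rb", "rt", "/rt", "/ruby"]))
# [(0, 'ruby'), (1, 'rb'), (1, 'rt')]

# rtc stays open, so both rt elements end up inside it.
print(nesting(["ruby", "rb", "/rb", "rtc", "rt", "rt", "/rtc", "/ruby"]))
# [(0, 'ruby'), (1, 'rb'), (1, 'rtc'), (2, 'rt'), (2, 'rt')]
```

In the first call rt ends up as a sibling of rb rather than a child, which is the behaviour the authoring requirement above is meant to guarantee in legacy UAs through manual closing.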
Impact
Positive Effects
- Offers simple transition from existing (simple) ruby mark-up to HTML5. E.g. it is simple to define cross-language 'microformats' that include ruby if HTML5 and the other language include the same elements. And it is simple to make tools, and to build on existing tools, that work in HTML5 as well as in XHTML 1.1.
- Offers simple, thought-free mark-up of ruby base text: it is simple to add language tagging, ARIA attributes, RDFa/Microdata attributes etc. to the ruby base text.
- Simple, direct styling of ruby base text.
- Future readiness (for advanced ruby): the rb is necessary in advanced ruby.
- Most authors and authoring tools will continue to use rb: no need to learn something new.
- Instead of forbidding rb, with difficult-to-explain reasons as justification and yet with effects on authors and authoring tools, allowing rb benefits those authors, parsers and authoring tools that already use/implement it, and avoids needlessly bothering them.
- Authors don't have to wait for new CSS features to be invented (and deployed) before they get a CSS selector that is dedicated to styling ruby base text, and only ruby base text!
- The change of content model ensures that text is meaningful both in source and in display, thus simplifying treatment of ruby in AT, translation services, find-in-page features, spell-checkers etc.
- Simpler styling and more efficient metadata tagging of ruby base, due to inclusion of rbc (tag one element instead of all the rb elements).
- Preparedness for the future, due to the change in the HTML5 parser so that it doesn't auto-close the rtc element.
Negative Effects
- rb is not well supported in all legacy HTML parsers (the Trident parser), hence authors have to be aware of the need to use helper scripts (Modernizr, HTML5shiv etc.) and helper CSS in order to get good styling.
  - Counter argument: When this causes problems, authors have the option of dropping the rb (if it helps) and/or adding an additional wrapper, such as span.
  - Counter argument: If, as in legacy IE versions, the rb renders as an empty element, then no harm is done. Effectively, it means that the ruby base text is rendered without a wrapper, equivalent to dropping the rb.
  - Counter argument: Why would the lacking support in legacy HTML parsers be any more important to consider for rb than for other new HTML elements?
  - Counter argument: Since there actually is some support for rb in legacy UAs (at least, it is treated as span), it could be argued that it is simpler to include rb than to include the many completely new HTML elements in HTML5.
  - IE since IE9 supports rb natively.
- Because there is little visual difference between rb being supported and not being supported, authors may think that rb works while in reality it doesn't.
  - Counter argument: This can be said about legacy parsers with regard to many new elements. And since it hasn't been held against any other new element in HTML5, it seems unjustified to hold it against rb.
- Some pages that are authored according to the current content model will become invalid due to this CP's requirement that no more than a single adjacent pair of rb and rt occur in the same ruby element.
  - The benefit of fixing the page is more important than this slight annoyance.
- Change of content model affects how one browser (Webkit) with partial support for ruby displays the ruby.
  - This is true. However, the source order is so important that it is worth it.
- Use of the rbc element affects how one browser (Webkit) with partial support for ruby displays the ruby.
  - Counter argument: This is not true. It is the change of the content model that can cause this. The inclusion of rbc instead allows authors to style the ruby so that it looks fine in Webkit as well.
  - Counter argument: Actually, in Trident, the <rbc> does no harm.
Conformance Classes Changes
- It becomes conforming to use the rb element inside ruby.
- The HTML5 parser must auto-close the rb element, and any other element except ruby and rtc, when it sees rt or rp.
- Until UAs offer broad support for auto-closing, it becomes RECOMMENDED to manually close the rb element.
- The rbc element becomes valid.
- Only a single adjacent pair of rb and rt is conforming.
Risks
- If the author fails to manually close the rb element, then legacy UAs may place the rest of the ruby content inside the rb element, causing the mark-up to malfunction in legacy parsers.
  - Counter argument: In existing usage, the rb seems to almost always be closed.
  - Counter argument: This is a temporary problem; UAs will update. (Currently released versions of Gecko and Webkit, plus IE10, do it.)
- Authors who do close the rb element might think that closing the rb will guarantee that it works in legacy UAs.
  - Counter argument: Why? It is already well known that legacy versions of e.g. Firefox and IE do not handle unknown elements well, and that one must use various tricks in order to make new HTML5 elements work in legacy UAs.
- If authors are required to use span instead, then they know that span does not get auto-closed, whereas for rb, they have to learn that while it is intended to auto-close, it does not so far get auto-closed.
  - Counter argument: Actually, this is not true, since in Gecko, Webkit and IE10 any element gets auto-closed when the parser sees rp or rt.
  - Counter argument: To the degree that it is true (see above), it is a temporary problem; UAs will update. Also: because pre-HTML5 ruby is part of XHTML 1.1, the bulk of legacy code does close the rb element, so it does not seem to be much of a problem. This CP does, however, suggest that authors SHOULD close the rb until UAs catch up.
- Not making the rb element obligatory (as in XHTML 1.1) creates the risk that authors omit the rb element, just because they can, and because they think there is a benefit in doing so.
  - Comment: Agreed. I lean towards making the rb element obligatory, as I don't see it as particularly healthy that one can omit the rb element. From my perspective, allowing the rb to be omitted is just a compromise position (and I don't rule out that my argumentation in favour of including rb would have been more successful if I had asked for it to be obligatory). Allowing omission has primarily only one benefit, namely that it fits slightly better with the fact that IE6-8 does not recognize this element by default. But this does not seem to be much of an argument since, in an HTML5 parser, any unknown element would be handled in a defined way. Thus, while <rb>word</rb> might fail to work in an un-prepped copy of IE6/IE7/IE8 (and maybe in Firefox 2), it would nevertheless work in an HTML5 parser (as well as in e.g. an IE6/7/8 browser prepped with an HTML5 helper script).
  - Comment: Indeed. There would be no more benefit in omitting the rb element than there would be for the same author in dropping the html, head or body element. E.g. if the author drops the html element, then he/she also drops adding a language tag, semantic metadata and so on for the entire document. Actually, when html is dropped, the element is still auto-generated by the HTML5 parser, which means that it is readily available for scripting and styling. In contrast, when the rb element is dropped, there is no automatic generation of the element. This means that an author who omits the rb element also takes away the direct opportunity to apply e.g. CSS to the element. (Fortunately, however, by including rb in HTML5, the author (or the authoring tool) has a very simple recipe for fixing such situations.)
References
Relevant tests
rb versus span
- Richard Ishida's test of rb versus the alternatives demonstrates that:
  - for (more or less) HTML5-capable parsers (Firefox 9 and Opera 11.5), use of span does not work any better than using rb: in either case, the author, if he/she wants to use ruby, must use alternative CSS styling, due to the lack of support for the Ruby CSS module.
  - for UAs that have some kind of built-in ruby support (IE, Safari/Chrome), span and rb work equally well.
  - if one adds the HTML5 shiv, <script>document.createElement("rb")</script>, to Richard's tests (see demo), then rb works as well/badly as span in legacy IE versions.
auto-closing of the rb element
- Test: auto-closing test.
- Results:
  - Parser does not recognize <rb> as an element (even if it recognizes <ruby> and <rt>): IE5-8.
    In IE5-8, this has two effects: a) rb styling does not work; b) it might make it seem as if auto-closing does work.
  - Parser auto-closes any element (except ruby itself) when it sees <rt> or <rp>: Firefox, Webkit
  - No auto-closing happens: Opera, IE9 and IE5-8 with HTML5 shiv (<script>document.createElement("rb");</script>)
Use cases for rb
- Use cases for rb amount to documenting that there are use cases for the addition of language tags, CSS, metadata (via Microformats, Microdata or RDFa) and ARIA to ruby base text. In our view, it does not make sense to require documentation, via use cases, that ruby base text needs language tagging, styling, metadata or ARIA, since these are features which are generally accepted as needed anywhere, on any element. Nevertheless, we will mention some such examples:
  - Language tagging:
    - The very idea behind ruby markup is to express a translation (in the widest sense of the word) of a ruby base text in the form of a ruby (annotation) text. Often the difference between base and text is only a difference in script: the language is the same while the script differs. Other times, the language differs. In either case, the difference in script or language can be expressed via language tags on either rb or rt, or on both. We can conclude that language tagging (to the degree that it has any relevance at all) is more frequently needed for ruby mark-up than for any other HTML construct.
    - The language is inherited from a parent element, e.g. from p or html. And since the ruby text (rt) is supposed to explain the ruby base (rb), one must conclude that it is the ruby base that will most frequently need to be language tagged. (The rt will just inherit the language tagging value from the parent element.) Without the rb element, one would first have to add a language tag on the ruby element, and then add a language tag on the rt element in order to cancel the (inherited) effect of setting the language on the ruby element, which is quite ad hoc and impractical, in our view.
  - ARIA, CSS
    - In the WebAIM forum, it was recently asked how to convey to an AT user that a Roman numeral (such as IV) stood for 4 and was not to be read as the letters I plus V. I provided an answer where one of the options was to use ruby markup. With the help of <rb aria-hidden="true">, this was the only method that worked even in the Mac OS X screen reader (VoiceOver). In the demo code, I also added styling of the rb element in the form of rb{text-decoration:underline}, to hint to sighted users as well that hovering over this text would provide information. Thus I was able to hide the ruby text, by default, from everyone except screen reader users (see demo).
  - CSS selectors for ruby base
    - Koji Ishii suggests treating rb just like tbody: thus, the selector should work even if the author skips actually typing it. A good idea. This proposal does not, however, make that suggestion, as it would mean that one would have to change the HTML5 parser so that it autogenerates the element. No such change is on the horizon. Alternatively, one could add a pseudo selector in CSS, e.g. ruby:base{}. But no such selector is on the horizon either. The point stands that it is necessary to be able to select the ruby base, and that the simplest way is to use rb.
  - metadata (Microformats, RDFa, Microdata)
    - No examples.
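The language-tagging use case above can be sketched in a few lines. This is an illustration only, assuming well-formed mark-up with explicit end tags; the class LangCollector, the helper effective_langs and the sample base/annotation pair (a Japanese base with an English gloss) are hypothetical and not taken from the proposal.

```python
from html.parser import HTMLParser


class LangCollector(HTMLParser):
    """Record each text run together with its inherited lang value."""

    def __init__(self):
        super().__init__()
        self.langs = []  # effective language of each open element
        self.found = []  # (text, effective language) pairs

    def handle_starttag(self, tag, attrs):
        inherited = self.langs[-1] if self.langs else None
        # An explicit lang attribute overrides the inherited value.
        self.langs.append(dict(attrs).get("lang", inherited))

    def handle_endtag(self, tag):
        if self.langs:
            self.langs.pop()

    def handle_data(self, data):
        self.found.append((data, self.langs[-1] if self.langs else None))


def effective_langs(markup):
    collector = LangCollector()
    collector.feed(markup)
    collector.close()
    return collector.found


# With rb: one lang attribute on the base; rt inherits "en" from the paragraph.
with_rb = '<p lang="en"><ruby><rb lang="ja">東京</rb><rt>Tokyo</rt></ruby></p>'
# Without rb: tagging ruby alone also changes the language of the rt text.
without_rb = '<p lang="en"><ruby lang="ja">東京<rt>Tokyo</rt></ruby></p>'
# Without rb, a second attribute on rt is needed to cancel the inheritance.
without_rb_fixed = '<p lang="en"><ruby lang="ja">東京<rt lang="en">Tokyo</rt></ruby></p>'

print(effective_langs(with_rb))           # [('東京', 'ja'), ('Tokyo', 'en')]
print(effective_langs(without_rb))        # [('東京', 'ja'), ('Tokyo', 'ja')]
print(effective_langs(without_rb_fixed))  # [('東京', 'ja'), ('Tokyo', 'en')]
```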