IncludeRB
From HTML WG Wiki
Change Proposal for ISSUE-172: Include the rb element
Leif Halvard Silli 03:12, 23 January 2012 (UTC)
Summary
This proposal requests:
- that the rb element, which is currently considered obsolete, be included, for the following reasons:
  - legacy documents: rb inclusion makes legacy mark-up (in the wild and per XHTML 1.1, where rb is mandatory) HTML5-compatible.
  - legacy tools: rb inclusion allows tools to operate with largely identical ruby modes for XHTML 1.1 simple ruby and (simple) HTML5 ruby (instead of operating two modes that differ greatly) and also allows authors to use XHTML 1.1 tools to create HTML5-compatible ruby content.
  - native wrapper: because rb belongs only in the ruby format, tools can auto-insert it together with the parent ruby element, which is often more practical than manually adding a wrapper as an afterthought. By contrast, the current HTML5 alternative will never offer the same convenience, since:
    - the alternative wrapper (e.g. span) is not part of the format, so one has to add it manually (by contrast, when the editor oXygen inserts the ruby element in an XHTML 1.1 document, the rb element is added automatically so that the author can start to type inside it, much as many editors, when inserting a dl or a table, also insert the required child elements);
    - when the content of a non-native wrapper is deleted (with a browser-based WYSIWYG tool), the wrapper itself will often be deleted too (in order to prevent littering the code with stray, empty elements; XStandard does this, see the heading “Empty Tags”), whereas an empty element that is part of the format itself could simply remain in the code, ready to be refilled with content.
  - CSS2 selector: the rb element provides a cross-browser CSS2 selector (rb{}) that, unlike e.g. span{}, selects only ruby base text. In order to offer a similar feature without inclusion of rb, HTML5's editor has proposed extending the old and largely unimplemented CSS3 Generated and Replaced Content Module with, quote, “a pseudo-element that can style certain spans of descendants; the flip side of ::outside”. His proposed “flip side of ::outside” does, however, (according to CanIuse.com) have zero implementations. And the fact that the editor also questions the need to select the ruby base text at all does not make this option any more credible. Note: For simple ruby, it is not uncommon (examples: one, two (two a)) to use ruby{display:inline-table} in combination with rb{display:table-row-group; /* or similar, from CSS tables */}. It may be that removing the rb makes the styling less robust, but other than that, it seems to work also without rb.
  - CSS2 backup styling: HTML5 prescribes the style rules ruby{display:ruby;} rt{display:ruby-text;}, which stem from the CSS3 ruby module, but which none of the common browsers fully support yet (not even Webkit or IE, even though they have some ruby support). Thus, if one wants it to look like ruby in Opera and Firefox, one must hack up some backup CSS that works, and a wrapper element for the ruby base text may then come in handy, even if it is not impossible to make it work without it: demo of styling that works in Opera + Firefox.
  - metadata readiness: A wrapper around the ruby base words makes the ruby base text ready to be independently language tagged (<rb lang="*">), to become ARIA “styled” (e.g. <rb aria-hidden="true">) and to receive an HTML class name (<rb class="important">) without affecting either <ruby>, <rt> or <rp>.
- that the content model of ruby is changed to make it non-conforming to let <rt> occur adjacent to <rb> more than once per ruby element, for the following reasons:
  - source order is important:
    - The HTML5 content model breaks with the content model of XHTML 1.1 by requiring that one write
      <ruby><rb>W</rb><rt>World</rt><rb>W</rb><rt>Wide</rt><rb>W</rb><rt>Web</rt></ruby>
    - By contrast, XHTML 1.1 requires that one write the following, which, as one can see, allows words and letters to be written in source order:
      <ruby><rbc><rb>W</rb><rb>W</rb><rb>W</rb></rbc><rtc><rt>World</rt><rt>Wide</rt><rt>Web</rt></rtc></ruby>
    - HTML5's break from the source order found in XHTML 1.1 creates problems for every parser/reader that needs to detect words by more than visual means.
      Problem examples: A user trying a find-in-page search in the browser for 'WWW' when the above code is used will not locate the 'WWW' that the user can see. For the same reason, it creates problems in screen readers, in online translation services like Google Translate, in copy-and-paste and selections, and so on. It is even hard to author, since the author cannot type the letters/words that belong together in one chunk; the author instead has to type one ruby base letter and then a ruby text letter/word etc., which is cumbersome and prone to error. It is comparable to a table model where the author has to work with cells on two rows simultaneously.
    - The content model of this change proposal requires that the above example be written like this:
      <ruby><rb>W</rb><rb>W</rb><rb>W</rb><rt>World</rt><rt>Wide</rt><rt>Web</rt></ruby>
      or like this (NOTE: rtc is not made conforming as of yet, due to legacy parser problems):
      <ruby><rbc><rb>W</rb><rb>W</rb><rb>W</rb></rbc><rt>World</rt><rt>Wide</rt><rt>Web</rt></ruby>
- that the rbc element is included as a ruby base wrapper, for the following reasons:
  - The rbc is very helpful when styling a ruby element: it becomes almost impossible to style ruby cross-browser unless it is included. And yet, it is made optional, to allow authors to “type less”.
- that the HTML5 parser is updated to handle rtc, for the following reasons:
  - currently, Gecko and Trident will auto-close any element as soon as the parser sees rp or rt. This prevents the introduction of rtc.
  - without rtc, double-sided ruby is more or less impossible to achieve.
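The CSS2 backup styling mentioned in the summary (ruby{display:inline-table} combined with rb{display:table-row-group}) could look roughly like the sketch below. It is an illustration only and not the code of the linked demos: the rt and rp rules, the font size and the alignment values are assumptions added to make the example complete.

```html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>CSS2 fallback for simple ruby (sketch)</title>
<style>
  /* Fallback for UAs without display:ruby support; all values are illustrative. */
  ruby { display: inline-table; text-align: center; vertical-align: bottom; }
  /* The rb selector matches only ruby base text, unlike span. */
  rb   { display: table-row-group; }
  /* Assumption: place the annotation above the base, regardless of source order. */
  rt   { display: table-header-group; font-size: 60%; line-height: 1; }
  /* Assumption: hide the fallback parentheses when the table layout applies. */
  rp   { display: none; }
</style>
</head>
<body>
<p>The <ruby><rb>WWW</rb><rp>(</rp><rt>World Wide Web</rt><rp>)</rp></ruby> is large.</p>
</body>
</html>
```

Without the rb element, the base text would sit in an anonymous table box, which may still render but leaves no selector for the base text alone.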
Rationale
- The inclusion of rb addresses the problem that the alternative (exclusion of rb) encourages ad hoc solutions with regard to marking up or styling the ruby base text.
- The inclusion of rb allows thought-free, direct transition to/from e.g. XHTML 1.1 and HTML5. (Thought-free = automatic: the author, or the authoring tool, does not have to ponder which element to use.)
- A dedicated wrapper element offers thought-free simplicity with regard to adding CSS, ARIA, a language tag or semantic metadata to the ruby base word, without effects on <ruby>, <rt> or <rp>.
- The change of the content model addresses the need for words/compounds to appear as words/compounds in the source as well, in order to be compatible with non-visual parsing (screen readers, find-in-page, translation services etc.), which overall needs to work with text that is just as logical in the source as in the display.
- The inclusion of rbc allows advanced ruby to be styled more simply.
- Preparing the HTML5 parser to handle rtc allows the rtc element to be introduced in HTML6, thereby allowing double-sided ruby.
Details
- A set of edit instructions:
  - Inside “Text-level semantics”: Add a new section about the rb element.
  - Inside “Text-level semantics”: Add a new section about the rbc element.
  - Inside “Rendering”: Add a new note about the default CSS styling of rb (rb{display:ruby-base}).
  - Inside “Rendering”: Add a new note about the default CSS styling of rbc (probably rbc{display:ruby-base-container}).
  - Inside “Obsolete features”: Delete the rb element.
- About the HTML5 parser:
  - With the exception of the ruby element itself and the rtc element, let the parser auto-close the current element (be it rb, rt, span or whatever) when the parser sees an rp or an rt element. This is almost what Gecko and Webkit currently do, with the exception that they also auto-close rtc.
- As authoring requirements:
  - Say that rb SHOULD be manually closed by the author, in order to accommodate legacy UAs (that do not auto-close it).
  - Say that the rb element is optional but RECOMMENDED (for styling, ARIA, language-tagging and semantic-web purposes (RDFa and Microdata)).
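As a minimal sketch of what the authoring requirements above would mean in practice, assuming the content model of this proposal were adopted, the following fragments show the recommended form, the form for several base characters, and the permitted but discouraged form without rb. The example text is illustrative only.

```html
<!-- Simple ruby: rb is present and explicitly closed (SHOULD, for legacy UAs). -->
<ruby><rb>WWW</rb><rt>World Wide Web</rt></ruby>

<!-- Several bases: all rb elements first, then the rt elements, in source order. -->
<ruby><rb>W</rb><rb>W</rb><rb>W</rb><rt>World</rt><rt>Wide</rt><rt>Web</rt></ruby>

<!-- rb omitted: permitted, but the base text then has no dedicated element
     to style, language-tag or annotate. -->
<ruby>WWW<rt>World Wide Web</rt></ruby>
```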
Impact
Positive Effects
- Offers simple transition from existing (simple) ruby mark-up to HTML5. E.g. it is simple to define cross-language 'microformats' that include ruby if HTML5 and the other language include the same elements. And it is simple to make tools, and build on existing tools, that work in HTML5 as well as XHTML 1.1.
- Offers simple, thought-free mark-up of ruby base text: it is simple to add language tagging, ARIA attributes, RDFa/Microdata attributes etc. to the ruby base text.
- Simple, direct styling of ruby base text
- Future readiness (for advanced ruby): the rb is necessary in advanced ruby.
- Most authors and authoring tools will continue to use rb: no need to learn something new.
- Instead of forbidding rb, with difficult-to-explain reasons as justification and yet with effects on authors and authoring tools, allowing rb benefits those authors, parsers and authoring tools that already use/implement it, and avoids needlessly bothering them.
- Authors don't have to wait for new CSS features to be invented (and deployed) before they get a CSS selector that is dedicated to styling ruby base text, and only ruby base text!
- Change of content model ensures that text is meaningful both in source and in display, thus simplifying treatment of ruby in AT, translation services, find-in-page features, spell-checkers etc.
- Simpler styling and more efficient metadata tagging of the ruby base, due to inclusion of rbc (tag one element instead of all the rb elements).
- Preparedness for the future, due to the change in the HTML5 parser so that it doesn't auto-close the rtc element.
Negative Effects
- rb is not well supported in all legacy HTML parsers (the Trident parser), hence authors have to be aware of the need to use helper scripts (Modernizr, HTML5shiv etc.) and helper CSS in order to get good styling.
  - Counter argument: When this causes problems, authors have the option of dropping the rb (if it helps) and/or adding an additional wrapper, such as span.
  - Counter argument: If, as in legacy IE versions, the rb renders as an empty element, then no harm is done. Effectively, it means that the ruby base text is rendered without a wrapper, which is equivalent to dropping the rb.
  - Counter argument: Why would the lacking support in legacy HTML parsers be any more important to consider for rb than for other, new HTML elements?
  - Counter argument: Since there actually is some support for rb in legacy UAs (at least, it is treated as span), it could be argued that it is simpler to include rb than to include the many completely new HTML elements in HTML5.
  - IE since IE9 supports rb natively.
- Because there is little visual difference between rb being supported and not being supported, authors may think that rb works when in reality it doesn't.
  - Counter argument: This can be said about legacy parsers with regard to many new elements. And since it hasn't been held against any other new element in HTML5, it seems unjustified to hold it against rb.
- Some pages that are authored according to the current content model will become invalid due to this CP's requirement that no more than a single adjacent pair of rb and rt occur in the same ruby element.
  - The benefit of fixing the page is more important than this slight annoyance.
- Change of content model affects how one browser (Webkit) with partial support for ruby displays the ruby.
  - This is true. However, the source order is so important that it is worth it.
- Use of the rbc element affects how one browser (Webkit) with partial support for ruby displays the ruby.
  - Counter argument: This is not true. It is the change of the content model that can cause this. The inclusion of rbc instead allows authors to style the ruby so that it looks fine also in Webkit.
  - Counter argument: Actually, in Trident, the <rbc> does no harm.
Conformance Classes Changes
- It becomes conforming to use the rb element inside ruby.
- The HTML5 parser must auto-close the rb element (and any other element except ruby and rtc) when it sees rt or rp.
- Until UAs offer broad support for auto-closing, it becomes RECOMMENDED to manually close the rb element.
- The rbc element becomes valid.
- Only a single adjacent pair of rb and rt is conforming.
Risks
- If the author fails to manually close the rb element, then legacy UAs may place the rest of the ruby content inside the rb element, causing the mark-up to malfunction in legacy parsers.
  - Counter argument: In existing usage, the rb seems to almost always be closed.
  - Counter argument: This is a temporary problem: UAs will update. (Currently released versions of Gecko and Webkit plus IE10 do it.)
- Authors who do close the rb element might think that closing the rb will guarantee that it works in legacy UAs.
  - Counter argument: Why? It is already well known that legacy versions of e.g. Firefox and IE do not handle unknown elements well, and that one must use various tricks in order to make new HTML5 elements work in legacy UAs.
- If authors are required to use span instead, then they know that span does not get auto-closed, whereas for rb, they have to learn that while it is intended to auto-close, it does not so far get auto-closed.
  - Counter argument: Actually, this is not true since, in Gecko, Webkit and IE10, any element gets auto-closed when the parser sees rp or rt.
  - Counter argument: To the degree that it is true (see above), it is a temporary problem: UAs will update. Also: because pre-HTML5 ruby is part of XHTML 1.1, the bulk of legacy code does close the rb element, so it does not seem to be much of a problem. This CP does, however, suggest that authors SHOULD close the rb until UAs catch up.
- To not make the rb element obligatory (like in XHTML 1.1) creates the risk that authors omit the rb element, just because they can, and because they think there is a benefit in doing so.
  - Comment: Agreed. I lean towards making the rb element obligatory, as I don't see it as particularly healthy that one can omit the rb element. From my perspective, allowing the rb to be omitted is just a compromise position (and I don't rule out that my argumentation in favour of including rb would have been more successful if I had asked for it to be obligatory). There is primarily only one benefit, namely that it fits slightly better with the fact that IE6-8 does not by default recognize this element. But this does not seem to be much of an argument since, in an HTML5 parser, any unknown element would be handled in a defined way. Thus, while <rb>word</rb> might fail to work in an un-prepped copy of IE6/IE7/IE8 (and maybe in Firefox 2), it would nevertheless work in an HTML5 parser (as well as in e.g. an IE6/7/8 browser prepped with an HTML5 helper script).
  - Comment: Indeed. There would be no more benefit in omitting the rb element than there would be for the same author in dropping the html, head or body element. E.g. if the author drops the html element, then he/she also drops adding a language tag, semantic metadata and so on for the entire document. Actually, when dropping html, the element is still auto-generated by the HTML5 parser, which means that it is readily available for scripting and styling. In contrast, when dropping the rb element, there is no automatic generation of the element. This means that when an author omits the rb element, he/she also takes away the direct opportunity to apply e.g. CSS to the element. (Fortunately, however, by including the rb in HTML5, the author (or the authoring tool) has a very simple recipe for fixing such situations.)
References
Relevant tests
rb versus span
- Richard Ishida's test of rb versus the alternatives demonstrates that:
  - for (more or less) HTML5-capable parsers (Firefox 9 and Opera 11.5), use of span does not work any better than using rb: in either case, the author, if he/she wants to use ruby, must use alternative CSS styling, due to the lack of support for the Ruby CSS module.
  - for UAs that have some kind of built-in ruby support (IE, Safari/Chrome), span and rb work equally well.
  - if one adds the HTML5 shiv, <script>document.createElement("rb")</script>, to Richard's tests (see demo), then rb works as well/badly as span in legacy IE versions.
auto-closing of the rb element
- Test: auto-closing test.
- Results:
  - Parser does not recognize <rb> as an element (even if it recognizes <ruby> and <rt>): IE5-8.
    In IE5-8, this has two effects: a) rb styling does not work; b) it might make it seem as if auto-closing does work.
  - Parser auto-closes any element (except ruby itself) when it sees <rt> or <rp>: Firefox, Webkit.
  - No auto-closing happens: Opera, IE9 and IE5-8 with HTML5 shiv (<script>document.createElement("rb");</script>).
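To illustrate what these results mean for an author, consider an rb that is left unclosed. The trees in the comments are a sketch derived from the results listed above, not captured output from the test page.

```html
<!-- Source with an unclosed rb: -->
<ruby><rb>W<rt>World</ruby>

<!-- UAs that auto-close on rt (listed above: Firefox, Webkit) build siblings:
     ruby
       rb  "W"
       rt  "World"
-->

<!-- UAs without auto-closing (listed above: Opera, IE9) nest the rt in the rb:
     ruby
       rb  "W"
         rt  "World"
-->

<!-- Closing the rb explicitly gives the sibling structure in both groups: -->
<ruby><rb>W</rb><rt>World</rt></ruby>
```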
Use cases for rb
- Use cases for rb amount to documenting that there are use cases for the addition of language tags, CSS, metadata (via Microformats, Microdata or RDFa) and ARIA to ruby base text. In our view, it does not make sense to accept that it has to be documented, via use cases, that ruby base text needs language tagging, styling, metadata or ARIA, since these are features which are generally accepted as needed anywhere on any element. Nevertheless, we will mention some such examples:
  - Language tagging:
    - The very idea behind ruby markup is to express a translation (in the widest sense of the word) of a ruby base text in the form of a ruby (annotation) text. Often the difference between base and text is only a difference in script: the language is the same while the script differs. Other times, the language differs. In either case, the difference in script or language can be expressed via language tags on either rb or rt, or on both. We can conclude that language tagging (to the degree that language tagging has any relevance at all) is more frequently needed for ruby mark-up than for any other HTML construct.
    - The language is inherited from a parent element, e.g. from p or html. And since the ruby text (rt) is supposed to explain the ruby base (rb), one must conclude that it is the ruby base that most frequently will need to be language tagged. (The rt will just inherit the language tagging value from the parent element.) Without the rb element, one would have to first add a language tag on the ruby element, and then add a language tag on the rt element in order to cancel the (inherited) effect of setting the language on the ruby element, which is quite ad hoc and impractical, in our view.
  - ARIA, CSS
    - In the WebAIM forum, it was recently asked how to convey to an AT user that a Roman numeral (such as IV) stood for 4 and was not to be read as the letters I plus V. I provided an answer where one of the options was to use ruby markup. With the help of <rb aria-hidden="true">, this was the only method that worked even in the Mac OS X screen reader (VoiceOver). In the demo code, I also added styling of the rb element in the form of rb{text-decoration:underline}, to hint to sighted users as well that hovering above this text would provide information. Thus I was able to, by default, hide the ruby text for everyone except screen reader users; see demo.
  - CSS selectors for ruby base
    - Koji Ishii suggests treating rb just like tbody, so that the selector would work even if the author skips actually typing the element. A good idea. This proposal does not, however, include that suggestion, as it would mean that one would have to change the HTML5 parser so that it auto-generates the element. No such change is on the horizon. Alternatively, one could add a pseudo selector in CSS, e.g. ruby:base{}. But no such selector is on the horizon either. The fact remains that it is necessary to be able to select the ruby base, and that the simplest way is to use rb.
  - metadata (Microformats, RDFa, Microdata)
    - No examples.
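The language-tagging and ARIA use cases above can be sketched as follows. The example words and language tags are assumptions chosen for illustration, and the last fragment is not the code of the WebAIM demo: it only combines the <rb aria-hidden="true"> attribute and the rb{text-decoration:underline} rule that the text mentions.

```html
<!-- With rb: only the base needs a language tag; the rt inherits "en" from the p. -->
<p lang="en">The capital is <ruby><rb lang="ja">東京</rb><rt>Tokyo</rt></ruby>.</p>

<!-- Without rb: the tag goes on ruby and must then be cancelled again on rt. -->
<p lang="en">The capital is <ruby lang="ja">東京<rt lang="en">Tokyo</rt></ruby>.</p>

<!-- Roman numeral: the base is hidden from AT, so that the rt is what gets read. -->
<style>
  rb { text-decoration: underline; } /* hint for sighted users, as described above */
</style>
<p>Chapter <ruby><rb aria-hidden="true">IV</rb><rt>4</rt></ruby></p>
```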