Bug 21040 - Double-sided ruby
Double-sided ruby
Status: NEW
Product: HTML WG
Classification: Unclassified
Component: HTML5 spec
Other other
: P3 normal
: ---
Assigned To: Robin Berjon
HTML WG Bugzilla archive list
Depends on: 20115
  Show dependency treegraph
Reported: 2013-02-18 14:57 UTC by Robin Berjon
Modified: 2013-02-26 22:11 UTC (History)
8 users (show)

See Also:


Note You need to log in before you can comment on or make changes to this bug.
Description Robin Berjon 2013-02-18 14:57:03 UTC
+++ This bug was initially created as a clone of Bug #20115 +++

From: http://lists.w3.org/Archives/Public/public-whatwg-archive/2012Nov/0370.html

When you want to create double-sided ruby, with the ruby text on both
sides of base text, the current HTML model posits two separate and
fairly different markup models.

In the first, when the group boundaries for both ruby text runs are
the same, it allows you to have two <rt>s following an <rb>, with the
obvious meaning.

In the second, when the group boundaries do *not* line up (in
particular, for the common case where one line of ruby is
per-character and the other is for the whole group, such as with a
pinyin and English translation), it requires you to nest two <ruby>
elements, with the inner one supplying the per-character annotations
and the outer supplying the whole-group ones.

Having to learn and use two different markup patterns for two nearly
identical use-cases is sub-optimal for authors.  It would be best if
they could just learn one model that works for both.

On the implementation side, this also requires two different layout
models for essentially the exact same thing.  This is unnecessarily
complicated; again, one simple way to get both would be preferred.

This is easy to address.  Add an <rtc> element (name taken from the
XHTML Ruby module), which is used for the second line of text.  You
can fill an <rtc> with <rt> elements, in which case they match up
index-wise with the preceding run of <rb> elements.  The last <rt>
(or, if no <rt>s were given at all, the naked text that was implicitly
wrapped in an <rt>) automatically spans the remaining bases in the
preceding run.

This makes both cases trivial.  If both runs of ruby are
per-character, you can just write:


Or, in the pure column-based model:


Alternately, if the second line of ruby text spans the entire group,
that's also trivial, and very simlar:

<ruby><rb>FOO<rb>BAR<rt>baz1<rt>baz2<rtc>qux1 qux2</ruby>

As you can see, the only difference is that the <rtc> contains a
single (implicit) <rt>, rather than two <rt>s.  It seems plainly
obviously that this is simpler for authors; it's also simpler for
implementors, because we don't have to infer that we should be
formatting something as double-ruby from the presence of nested <ruby>